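This book centers on IPython, NumPy, Pandas, Matplotlib, and Scikit-Learn, five foundational tools that cover most data science work. Taking a hands-on approach, it teaches common data science tasks such as cleaning and visualizing data and building statistical or machine learning models from data. Its aim is to give people in any field whose work involves data processing the ability to identify and solve problems.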
Preface
Part I. Jupyter: Beyond Normal Python
1. Getting Started in IPython and Jupyter
Launching the IPython Shell
Launching the Jupyter Notebook
Help and Documentation in IPython
Accessing Documentation with ?
Accessing Source Code with ??
Exploring Modules with Tab Completion
Keyboard Shortcuts in the IPython Shell
Navigation Shortcuts
Text Entry Shortcuts
Command History Shortcuts
Miscellaneous Shortcuts
2. Enhanced Interactive Features
IPython Magic Commands
Running External Code: %run
Timing Code Execution: %timeit
Help on Magic Functions: ?, %magic, and %lsmagic
Input and Output History
IPython's In and Out Objects
Underscore Shortcuts and Previous Outputs
Suppressing Output
Related Magic Commands
IPython and Shell Commands
Quick Introduction to the Shell
Shell Commands in IPython
Passing Values to and from the Shell
Shell-Related Magic Commands
3. Debugging and Profiling
Errors and Debugging
Controlling Exceptions: %xmode
Debugging: When Reading Tracebacks Is Not Enough
Profiling and Timing Code
Timing Code Snippets: %timeit and %time
Profiling Full Scripts: %prun
Line-by-Line Profiling with %lprun
Profiling Memory Use: %memit and %mprun
More IPython Resources
Web Resources
Books
Part II. Introduction to NumPy
4. Understanding Data Types in Python
A Python Integer Is More Than Just an Integer
A Python List Is More Than Just a List
Fixed-Type Arrays in Python
Creating Arrays from Python Lists
Creating Arrays from Scratch
NumPy Standard Data Types
5. The Basics of NumPy Arrays
NumPy Array Attributes
Array Indexing: Accessing Single Elements
Array Slicing: Accessing Subarrays
One-Dimensional Subarrays
Multidimensional Subarrays
Subarrays as No-Copy Views
Creating Copies of Arrays
Reshaping of Arrays
Array Concatenation and Splitting
Concatenation of Arrays
Splitting of Arrays
……
Part III. Data Manipulation with Pandas
Part IV. Visualization with Matplotlib
Part V. Machine Learning
Index