Advanced NumPy: Fancy Indexing, Statistics, and Performance


For serious data analysis, you'll need to go beyond simple slicing and use more powerful indexing and statistical tools.
Fancy Indexing: Your Ultimate Filter
Fancy indexing allows you to select elements using another array of indices. This is perfect for filtering data based on conditions.
Boolean Indexing: Use a boolean array (an array of True/False values) to filter.
python
data = np.array([10, 20, 30, 40, 50])
# Create a boolean mask
filter_mask = data > 30
print(filter_mask)
# Output: [False False False  True  True]

# Use the mask to get filtered data
filtered_data = data[filter_mask]
print(filtered_data)
# Output: [40 50]
Use code with caution.
Integer Indexing: Use an array of integers to pick specific elements.
python
data = np.array(['A', 'B', 'C', 'D', 'E'])
indices = np.array([0, 2, 4]) # Select elements at these positions
selected_elements = data[indices]
print(selected_elements)
# Output: ['A' 'C' 'E']

Statistical Power at Your Fingertips
NumPy has a huge library of functions for statistical analysis, allowing you to perform calculations on entire arrays with a single line of code.
python
data = np.array([[10, 15, 20], [25, 30, 35]])

# Get the mean of the entire array
print("Mean:", np.mean(data)) # Output: 22.5

# Get the mean of each column (axis=0)
print("Mean of columns:", np.mean(data, axis=0)) # Output: [17.5 22.5 27.5]

# Get the mean of each row (axis=1)
print("Mean of rows:", np.mean(data, axis=1)) # Output: [15. 30.]

Performance Tips: Views vs. Copies
Understanding whether an operation returns a "view" or a "copy" of an array is critical for writing efficient code and avoiding bugs.
  • View: Slicing creates a view, which means it's a new way of looking at the same data. If you modify the view, you modify the original array.
  • Copy: Fancy indexing creates a copy. If you modify the copy, the original array is unaffected.
Example: View vs. Copy
python
original_array = np.arange(5)
view_slice = original_array[0:3]
copy_fancy = original_array[[0, 1, 2]]

# Modify the view
view_slice[0] = 99
print("Original after view modification:", original_array)
# Output: [99  1  2  3  4]

# Modify the copy
copy_fancy[0] = 100
print("Original after copy modification:", original_array)
# Output: [99  1  2  3  4] (no change)

Using .copy() is a good practice if you want to ensure you're working with an independent copy of your data.

Comments

Popular Posts