pandas_utils module
Utility functions for pandas:
bspd_print
: pretty-prints a data frame
bspd_cross_products
: generates cross-products of variables
bspd_statsdf
: makes a dataframe with columns from an array, with specified column names
bspd_prepareplot
: prepares a dataframe for plotting (very specific)
bspd_cross_products(df, l1, l2=None, with_squares=True)
Returns a DataFrame with cross-products of the variables of `df` whose names are in `l1` and `l2`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | `DataFrame` | any data frame | required |
`l1` | `list[str]` | a list of names of variables that belong to `df` | required |
`l2` | `list[str] \| None` | ibidem; | `None` |
`with_squares` | `bool \| None` | if | `True` |
Returns:
Type | Description |
---|---|
`DataFrame` | the data frame of cross-products with concatenated names. |
Source code in bs_python_utils/pandas_utils.py
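To make the description above concrete, here is a minimal sketch of one way to build cross-products of columns with plain pandas. It is not the library's implementation: the function name `cross_products_sketch`, the `sep` argument, the `a_b` naming of the product columns, and the reading of `l2=None` as "pair the variables of `l1` with each other" are all assumptions made for illustration.

```python
from itertools import product

import pandas as pd


def cross_products_sketch(df, l1, l2=None, with_squares=True, sep="_"):
    """Illustrative only: elementwise products of the columns of df."""
    if l2 is None:
        # Assumption: with a single list, use each unordered pair once.
        pairs = [(a, b) for i, a in enumerate(l1) for b in l1[i:]]
    else:
        pairs = list(product(l1, l2))
    out = {}
    for a, b in pairs:
        if a == b and not with_squares:
            continue  # skip the square of a variable
        # Assumption: the new name joins the two original names with sep.
        out[f"{a}{sep}{b}"] = df[a] * df[b]
    return pd.DataFrame(out, index=df.index)


df = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
xp = cross_products_sketch(df, ["x", "y"])
print(xp)
```

With `with_squares=False` the sketch keeps only the `x_y` column; the actual naming and defaults of `bspd_cross_products` may differ.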
bspd_prepareplot(df)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | `DataFrame` | any dataframe whose column names either all end in '_n' for n an integer, or none does | required |
Returns:
Type | Description |
---|---|
`DataFrame` | a properly melted dataframe for plotting, with columns 'Sample', 'Statistic', 'Value', and 'Group' if there are several integers. |
Source code in bs_python_utils/pandas_utils.py
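The reshaping described above can be approximated with `DataFrame.melt`. The sketch below is a guess at the general idea, not the library's code: the function name `melt_for_plot_sketch` is hypothetical, and it assumes that 'Sample' is the row index, 'Statistic' is the column name without its '_n' suffix, and 'Group' is the integer n.

```python
import pandas as pd


def melt_for_plot_sketch(df):
    """Illustrative only: wide-to-long reshape with an optional Group column."""
    long = (
        df.rename_axis("Sample")
        .reset_index()
        .melt(id_vars="Sample", var_name="Statistic", value_name="Value")
    )
    # Split a trailing "_<integer>" off each original column name, if present.
    parts = long["Statistic"].str.extract(r"^(?P<stat>.*)_(?P<group>\d+)$")
    if parts["group"].notna().all():
        long["Statistic"] = parts["stat"]
        # Assumption: 'Group' is only added when several integers occur.
        if parts["group"].nunique() > 1:
            long["Group"] = parts["group"].astype(int)
    return long


wide = pd.DataFrame({"mean_1": [0.1, 0.2], "mean_2": [0.3, 0.4]})
long = melt_for_plot_sketch(wide)
print(long)
```

A long frame of this shape is what most plotting libraries expect when grouping by a categorical column.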
bspd_print(df, s='', max_rows=None, max_cols=None, precision=None)
Pretty-prints a data frame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | `DataFrame` | any data frame | required |
`s` | `str` | an optional title string | `''` |
`max_rows` | `int \| None` | maximum number of rows to print (all by default) | `None` |
`max_cols` | `int \| None` | maximum number of columns to print (all by default) | `None` |
`precision` | `int \| None` | number of digits of precision for printed numbers (3 by default) | `None` |
Returns:
Type | Description |
---|---|
`None` | nothing. |
Source code in bs_python_utils/pandas_utils.py
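As a rough illustration of the parameters above, the following sketch renders a frame with a title, row and column limits, and a float precision using standard pandas facilities (`pd.option_context` and `DataFrame.to_string`). The function name `render_frame_sketch` is hypothetical, and returning a string instead of printing is a choice made here so the result is easy to inspect; the library's own function may work differently.

```python
import pandas as pd


def render_frame_sketch(df, s="", max_rows=None, max_cols=None, precision=None):
    """Illustrative only: format df as text with an optional title line."""
    # Assumption: None means 3 digits, matching the documented default.
    digits = 3 if precision is None else precision
    with pd.option_context("display.precision", digits):
        # max_rows=None and max_cols=None print every row and column.
        body = df.to_string(max_rows=max_rows, max_cols=max_cols)
    return f"{s}\n{body}" if s else body


df = pd.DataFrame({"x": [3.14159, 2.0]})
text = render_frame_sketch(df, s="Demo", precision=2)
print(text)
```

Using `option_context` keeps the precision change local to the call, so global display settings are left untouched.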
bspd_statsdf(T, col_names)
Make a dataframe with columns from the array(s) in `T` and names from `col_names`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`T` | `ndarray \| list[ndarray]` | a list of n_T matrices or vectors with N rows, or a matrix or a vector with N rows | required |
`col_names` | `str \| list[str] \| list[str \| list[str]]` | a list of n_T name objects; a name object must be a string or a list of strings, with the names for the column(s) of the corresponding T matrix | required |
Returns:
Type | Description |
---|---|
`DataFrame` | a dataframe with the named columns. |
Source code in bs_python_utils/pandas_utils.py
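The pairing of arrays with name objects described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name `stats_frame_sketch` is hypothetical, and the sketch assumes that a single array comes with a single name object and that each vector contributes exactly one column.

```python
import numpy as np
import pandas as pd


def stats_frame_sketch(T, col_names):
    """Illustrative only: stack arrays side by side under the given names."""
    # Assumption: a bare array is treated like a one-element list.
    arrays = T if isinstance(T, list) else [T]
    names = col_names if isinstance(T, list) else [col_names]
    columns = {}
    for arr, name in zip(arrays, names):
        block = np.asarray(arr)
        if block.ndim == 1:
            block = block.reshape(-1, 1)  # a vector becomes one column
        labels = [name] if isinstance(name, str) else list(name)
        if len(labels) != block.shape[1]:
            raise ValueError("need exactly one name per column")
        for j, label in enumerate(labels):
            columns[label] = block[:, j]
    return pd.DataFrame(columns)


means = np.array([1.0, 2.0, 3.0])
quantiles = np.array([[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]])
stats = stats_frame_sketch([means, quantiles], ["mean", ["q10", "q90"]])
print(stats)
```

Here the vector `means` gets the single name `"mean"`, while the two-column matrix `quantiles` gets the list `["q10", "q90"]`.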