Data transformation with pandas vs. pyspark

This post compares pandas and PySpark on common data transformation tasks: checking DataFrame size, listing the unique values of a column, creating a new column, filtering rows, selecting a list of columns, aggregating, renaming columns, joining two DataFrames, creating a new DataFrame, and creating a pivot table.
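
As a rough illustration of the tasks listed above, here is a minimal pandas sketch with a possible PySpark counterpart noted in the comment on each line. The sample data and column names are hypothetical. In the comments, `sdf` stands for an assumed PySpark DataFrame holding the same data, `spark` for an assumed SparkSession, and `F` for `pyspark.sql.functions`. Only the pandas code is executed here.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
# PySpark: sdf = spark.createDataFrame(rows, ["city", "year", "sales"])
df = pd.DataFrame({
    "city": ["Paris", "Paris", "Lyon", "Nice"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 20, 5, 8],
})

# Size. PySpark: sdf.count(), len(sdf.columns)
n_rows, n_cols = df.shape

# Unique values of a column. PySpark: sdf.select("city").distinct()
cities = df["city"].unique()

# New column. PySpark: sdf.withColumn("sales_k", F.col("sales") / 1000)
df["sales_k"] = df["sales"] / 1000

# Filter rows. PySpark: sdf.filter(F.col("city") == "Paris")
paris = df[df["city"] == "Paris"]

# Select a list of columns. PySpark: sdf.select("city", "sales")
subset = df[["city", "sales"]]

# Aggregate. PySpark: sdf.groupBy("city").agg(F.sum("sales").alias("sales"))
totals = df.groupby("city", as_index=False)["sales"].sum()

# Rename a column. PySpark: totals_sdf.withColumnRenamed("sales", "total_sales")
renamed = totals.rename(columns={"sales": "total_sales"})

# Join two DataFrames. PySpark: sdf.join(regions_sdf, on="city", how="left")
regions = pd.DataFrame({
    "city": ["Paris", "Lyon", "Nice"],
    "region": ["North", "East", "South"],
})
joined = df.merge(regions, on="city", how="left")

# Pivot table. PySpark: sdf.groupBy("city").pivot("year").sum("sales")
pivot = df.pivot_table(index="city", columns="year", values="sales", aggfunc="sum")
```

One difference worth keeping in mind: pandas evaluates each line eagerly, while PySpark transformations are lazy and only run when an action such as `count()` or `show()` is called.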

How to apply mock with python unittest module?

This post shows how to use mocks with Python's unittest module: use "unittest.mock" to simulate the behavior of complex or real objects, configure a mock instance with "return_value" and/or "side_effect", assert how a method was called, and replace an object with "patch()".
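
The four features named above can be sketched in one small test case. This is only an illustration under stated assumptions: `count_entries`, the `client.fetch` method, and the path `/hypothetical/dir` are made-up names, not taken from the post.

```python
import os
import unittest
from unittest.mock import Mock, patch


def count_entries(path):
    """Hypothetical function under test: counts the entries in a directory."""
    return len(os.listdir(path))


class MockDemo(unittest.TestCase):
    def test_return_value(self):
        # A Mock accepts any attribute; return_value fixes what a call returns.
        client = Mock()
        client.fetch.return_value = {"status": "ok"}
        self.assertEqual(client.fetch("id-1"), {"status": "ok"})
        # Assert how the method was called.
        client.fetch.assert_called_once_with("id-1")

    def test_side_effect(self):
        # With an iterable side_effect, each call consumes the next item;
        # exception instances are raised instead of returned.
        client = Mock()
        client.fetch.side_effect = [TimeoutError("slow"), {"status": "ok"}]
        with self.assertRaises(TimeoutError):
            client.fetch("id-1")
        self.assertEqual(client.fetch("id-1"), {"status": "ok"})
        self.assertEqual(client.fetch.call_count, 2)

    def test_patch(self):
        # patch() swaps os.listdir for a mock only inside the with block.
        with patch("os.listdir", return_value=["a.txt", "b.txt"]) as fake_listdir:
            self.assertEqual(count_entries("/hypothetical/dir"), 2)
        fake_listdir.assert_called_once_with("/hypothetical/dir")
```

Saved in a test module, these tests can be run with `python -m unittest`. None of them touches the real file system or a real client, which is the point of mocking.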