Which one is better for an ETL tool, NumPy or pandas?

Best Answers

If you are planning to build an ETL pipeline that will need to scale a lot in the future, I would suggest looking at PySpark, with pandas and NumPy as Spark's best friends.

If you’ll ever perform ETL work outside of lab conditions, you’ll need to use both. pandas handles I/O and certain sorting tasks; for number manipulation there’s nothing as concise as NumPy.
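To illustrate that division of labor, here is a minimal sketch of a toy ETL step, assuming a small in-memory CSV. The column names, the `run_etl` helper, and the 50.0 threshold are hypothetical and only for illustration; a real job would read from and write to files or a database.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical raw input, invented for this example.
RAW_CSV = """order_id,quantity,unit_price
1,2,10.0
2,5,4.5
3,1,99.5
"""


def run_etl(raw_csv: str) -> pd.DataFrame:
    # Extract: pandas handles parsing and type inference.
    df = pd.read_csv(io.StringIO(raw_csv))

    # Transform: NumPy does the vectorized arithmetic on the raw arrays.
    total = df["quantity"].to_numpy() * df["unit_price"].to_numpy()
    df["total"] = total
    # The 50.0 cutoff is an arbitrary, assumed business rule.
    df["size"] = np.where(total >= 50.0, "large", "small")

    # pandas sorts the result before it is loaded.
    return df.sort_values("total", ascending=False).reset_index(drop=True)


result = run_etl(RAW_CSV)

# Load: serialize back to CSV text (passing a file path works the same way).
output_csv = result.to_csv(index=False)
```

Here pandas does the reading, sorting, and writing, while the arithmetic and the conditional labeling run as NumPy array operations.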

NumPy by itself is a fairly low-level tool, and using it is much like using MATLAB. pandas, on the other hand, provides rich time series functionality, data alignment, NA-friendly statistics, groupby, merge and join methods, and lots of other conveniences.
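A short sketch of a few of those conveniences next to the NumPy equivalent, assuming two tiny made-up tables (the `sales` and `stores` frames and their values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one revenue value is missing.
sales = pd.DataFrame({
    "store": ["a", "a", "b", "b"],
    "revenue": [10.0, np.nan, 5.0, 7.0],
})
stores = pd.DataFrame({"store": ["a", "b"], "city": ["Oslo", "Lima"]})

revenue = sales["revenue"].to_numpy()

# NumPy propagates NaN unless you ask for the NaN-aware variant.
raw_mean = np.mean(revenue)
nan_mean = np.nanmean(revenue)

# pandas skips missing values by default (NA-friendly statistics).
pandas_mean = sales["revenue"].mean()

# groupby and merge: aggregate per store, then join in the city lookup.
per_store = sales.groupby("store", as_index=False)["revenue"].sum()
report = per_store.merge(stores, on="store", how="left")
```

With plain NumPy, the grouping and the join would have to be written by hand with masks or index arrays, which is the "low-level" trade-off the answer describes.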

pandas is a great data transformation tool and it has totally taken over my workflow. I've mostly used it for analysis, but it could easily do ETL. For numerical work it's almost always worth checking out NumPy, SciPy, and pandas.
