If you are planning to build an ETL pipeline that will need to scale a lot in the future, I would suggest looking at PySpark, with pandas and NumPy as Spark's best friends.
If you’ll ever perform ETL work outside of lab conditions, you’ll need to use both. Pandas handles I/O and certain sorting tasks; for number manipulation there’s nothing as concise as NumPy.
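To make that split concrete, here is a minimal sketch of a tiny ETL step where pandas does the parsing and sorting and NumPy does the arithmetic. The CSV contents, the column names, the `run_etl` helper, and the 20% tax rate are all made up for illustration; a real job would read from a file or database rather than an in-memory string.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical input data; stands in for a file or database extract.
RAW_CSV = """order_id,amount
3,120.0
1,80.5
2,45.0
"""


def run_etl(raw_csv: str) -> pd.DataFrame:
    # Extract: pandas handles the parsing (I/O).
    df = pd.read_csv(io.StringIO(raw_csv))

    # Transform, part 1: pandas handles the sorting.
    df = df.sort_values("order_id").reset_index(drop=True)

    # Transform, part 2: NumPy handles the element-wise number work.
    amounts = df["amount"].to_numpy(dtype=float)
    df["amount_with_tax"] = np.round(amounts * 1.2, 2)  # assumed 20% rate
    df["is_large"] = amounts > 100  # assumed threshold

    return df


result = run_etl(RAW_CSV)

# Load: pandas writes the result back out, here as CSV text.
output_csv = result.to_csv(index=False)
```

The hand-off between the two is `to_numpy()`: pandas keeps the labels and the I/O, and the raw array goes to NumPy for the vectorized math.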
NumPy by itself is a fairly low-level tool, and using it is very similar to using MATLAB. pandas, on the other hand, provides rich time series functionality, data alignment, NA-friendly statistics, groupby, merge and join methods, and lots of other conveniences.
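A short sketch of a few of those conveniences (data alignment, NA-friendly statistics, groupby, and merge) might look like this. The series, the `sales` and `managers` frames, and their values are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Data alignment: Series are matched on index labels, not on position.
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])
aligned_sum = a + b  # "x" has no partner in b, so its result is NaN

# NA-friendly statistics: pandas skips NaN by default, plain np.mean does not.
pandas_mean = aligned_sum.mean()               # mean of 12.0 and 23.0
numpy_mean = np.mean(aligned_sum.to_numpy())   # NaN propagates

# groupby: total units per region.
sales = pd.DataFrame(
    {"region": ["east", "west", "east"], "units": [5, 3, 7]}
)
totals = sales.groupby("region", as_index=False)["units"].sum()

# merge: attach a lookup table, similar to a SQL left join.
managers = pd.DataFrame(
    {"region": ["east", "west"], "manager": ["Ana", "Bo"]}
)
report = totals.merge(managers, on="region", how="left")
```

Doing the same with bare NumPy arrays would mean tracking the labels by hand and reaching for `np.nanmean` explicitly, which is the "low-level" feel the comparison is getting at.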
Pandas is a great data transformation tool and it has totally taken over my workflow. I've mostly used it for analysis, but it could easily do ETL. For numerical work it's almost always good to check out NumPy, SciPy, and pandas.