Mandatory Skills:
• Experience designing ETL pipelines and distributed systems (Spark); tools such as AWS and Databricks
• Development using NumPy, pandas, and PySpark; expertise in SQL
• Experience with cloud technologies such as Amazon S3 and Redshift
• Analytical thinker with strong attention to detail and good verbal and written communication skills
Secondary Skills:
• Experience working with REST APIs to download data from different sources
• Familiarity with AWS, Airflow, MongoDB, and Linux