【Responsibilities】
- Design, develop, document, and test advanced data systems that bring together data from disparate sources and make it available to data scientists, analysts, and other users, using scripting and/or programming languages (Python, Java, etc.).
- Design, develop, implement, and scale data processing pipelines and data schemas according to business needs.
- Write and refine code to reliably and efficiently extract data from multiple sources, integrate disparate data into a common data model, and load the result into a target database, application, or file; debug data pipelines and ensure the timely delivery of applications.
- Manage deployment and data migration of the platform on public and private clouds.
- Evaluate structured and unstructured datasets using statistics, data mining, and predictive analytics to gain additional business insights.
- Independently initiate and drive projects; communicate data warehouse plans to internal stakeholders.
- Recommend process improvements to increase efficiency and reliability in ETL development.
- Continuously update knowledge by tracking and understanding emerging data pipeline practices and solutions.
【Our Ideal Candidate】
- Bachelor's degree in Computer Science, Information Management, or a related field.
- 5+ years of hands-on experience in the data warehouse space, including custom ETL design, implementation, and maintenance.
- 3 years of experience building and operating large-scale distributed systems or applications.
- Experience with SQL or similar languages, and development experience in at least one scripting language (Python preferred).
- Strong data architecture, data modeling, and schema design skills, along with effective project management skills.
- Experience with large datasets and data profiling techniques.
**Please note: The earliest onboarding date will be in Q1 2020.