The Programme
Job description
- Perform web data mining and big data extraction from a variety of online sources.
- Clean, transform, and validate data for use in analytics and machine learning applications.
- Design and manage data warehouses, data lakes, and cloud-based storage solutions.
- Automate data pipelines and workflows using Python, PySpark, and tools like Apache Airflow.
What You Will Do
- Big Data Mining: Extract and mine large-scale datasets from major e-commerce platforms in Vietnam, China, Korea, Southeast Asia, …
- Data Processing: Clean and transform raw data into structured formats suitable for analytics and machine learning.
- Data Infrastructure: Build automated pipelines and cloud solutions (e.g., AWS, GCP…).
- Data Integration and Management: Develop data warehouses and data lakes for optimal data storage and retrieval.
- LLM Data Pipeline: Develop pipelines for Large Language Models (LLMs), using techniques and frameworks such as RAG, LangChain, or LangGraph.
- Data Visualization: Create visualizations and reports to communicate insights effectively.
Required Skills and Abilities
- Education: Studying Computer Science, Data Science, Information Technology, or a related field.
- Technical Skills:
  - Proficiency in Python and data processing libraries (e.g., PySpark, Pandas).
  - Experience with data mining tools and techniques (e.g., BeautifulSoup, Scrapy, Selenium).
  - Understanding of data architecture concepts (data warehouses, data lakes, and cloud platforms).
  - Familiarity with data pipeline tools (e.g., Apache Airflow) and cloud management.
  - Familiarity with SQL for database management.
  - Basic understanding of HTML, CSS, JavaScript, and web structures.
  - Knowledge of LLM frameworks and techniques (e.g., LangChain, LangGraph, RAG).
- Soft Skills:
  - Strong problem-solving skills, attention to detail, and a passion for data engineering.
  - Good communication skills in both Vietnamese and English.
Other Details
- Duration: 3 months.
- Allowance: 3-4 million VND per month.
- Eligibility: Open to 3rd-year or final-year students.
- Career Opportunity: Potential for full-time employment after the internship.
Applications Close - 24 November 2024
About Us:
ABC Studio (Ai Bigdata Content Studio) is an innovative Korea-Vietnam AI and big data company specializing in generative AI and big data engineering, especially for the market intelligence and vision graphics industries.
Our vision is to become:
- the world's best market data company, with global e-commerce and SNS big data
- the leading innovative AI content engineering company for film (VFX), webtoons, and social marketing
At the moment, we have partnerships with Korean companies in the webtoon, film, fashion, cosmetics, food, and digital marketing industries.
We look forward to welcoming enthusiastic and talented interns who are willing to join us on a long and meaningful journey.