Job Description
About the Job:
Company: Capgemini
Role: Data Analyst
Location: Hyderabad
Experience: 2–5 Years
Job Type: Full Time
Job Description:
Capgemini is hiring a skilled Data Analyst with expertise in Python and PySpark to join its Hyderabad team. As a global leader in technology transformation and consulting, Capgemini empowers organizations to unlock the value of data and drive innovation through AI and advanced analytics. In this role, you will work on large-scale data processing and analytics projects, helping businesses derive actionable insights from complex datasets. This is an excellent opportunity to work in a collaborative, global environment while contributing to cutting-edge data engineering and analytics solutions.
As a Data Analyst, you will be responsible for building and optimizing data pipelines, transforming raw data into structured formats, and ensuring data quality across systems. You will work with distributed computing environments such as Spark clusters, Databricks, and cloud platforms to process large volumes of data efficiently. The role requires strong programming skills in Python and PySpark, along with a deep understanding of ETL/ELT workflows and data transformation techniques. Your work will play a critical role in enabling data-driven decision-making across business functions.
Capgemini offers a dynamic work culture that promotes learning, innovation, and inclusivity. You will have the opportunity to collaborate with global teams, work on impactful projects, and continuously enhance your technical skills. With exposure to modern big data technologies and enterprise-scale systems, this role provides a strong platform for career growth in data analytics, data engineering, and AI-driven solutions.
Roles & Responsibilities:
- Develop, optimize, and maintain PySpark-based data pipelines for ETL and ELT workflows, ensuring efficient data processing and transformation.
- Process and analyze large datasets using distributed computing frameworks such as Apache Spark, Databricks, and cloud-based platforms.
- Write clean, efficient, and reusable code in Python and PySpark to support data engineering and analytics tasks.
- Perform data cleansing, validation, and normalization to ensure high data quality and reliability for downstream applications.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions aligned with business objectives.
- Monitor and troubleshoot data pipelines to ensure smooth execution and timely data availability.
- Optimize performance of data processing jobs by tuning Spark configurations and improving code efficiency.
- Work with data storage systems and databases to manage structured and unstructured data effectively.
- Participate in Agile development processes, contributing to sprint planning, reviews, and continuous improvement initiatives.
- Stay updated with emerging technologies and best practices in big data, analytics, and cloud computing.
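The cleansing, validation, and normalization duties listed above can be sketched in plain Python; in the role itself the same logic would typically run as PySpark DataFrame transformations at cluster scale. The field names and rules below are illustrative assumptions, not an actual Capgemini schema.

```python
# Illustrative sketch only: hypothetical record schema and validation rules,
# shown in plain Python to keep the example self-contained.

def clean_record(raw):
    """Validate and normalize one raw record; return None if it fails validation."""
    # Validation: required fields must be present and non-empty.
    if not raw.get("customer_id") or raw.get("amount") is None:
        return None
    try:
        amount = float(raw["amount"])  # normalize the amount to a float
    except (TypeError, ValueError):
        return None  # reject records whose amount is not numeric
    return {
        "customer_id": str(raw["customer_id"]).strip(),
        # Normalization: trim and lowercase free-text fields.
        "region": str(raw.get("region", "unknown")).strip().lower(),
        "amount": round(amount, 2),
    }

raw_rows = [
    {"customer_id": " C001 ", "region": " South ", "amount": "120.45"},
    {"customer_id": "", "region": "North", "amount": "50"},       # missing ID
    {"customer_id": "C002", "region": "East", "amount": "oops"},  # bad amount
]
cleaned = [r for r in (clean_record(row) for row in raw_rows) if r is not None]
```

In a PySpark pipeline, the equivalent steps would usually be expressed declaratively with `filter`, `withColumn`, and cast operations so Spark can distribute them across the cluster.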
Requirements & Eligibility:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field. A strong academic background is preferred.
- 2–5 years of experience in data analytics, data engineering, or related roles with hands-on experience in Python and PySpark.
- Strong proficiency in Python programming and experience working with PySpark for large-scale data processing.
- Experience with distributed computing environments such as Databricks, Spark clusters, AWS EMR, Azure Synapse, or HDInsight.
- Good understanding of ETL/ELT processes, data pipelines, and data transformation techniques.
- Familiarity with data quality management practices, including data validation, cleansing, and normalization.
- Experience working with relational and non-relational databases for data storage and retrieval.
- Strong analytical and problem-solving skills to handle complex data challenges and optimize performance.
- Good communication and collaboration skills to work effectively with global teams and stakeholders.
- Ability to adapt to fast-paced environments and continuously learn new tools and technologies.
Expected Salary:
The expected salary for a Data Analyst (Python, PySpark) at Capgemini in Hyderabad typically ranges from ₹6 LPA to ₹14 LPA, depending on experience, expertise in big data technologies, and proficiency in PySpark and cloud platforms. Additional benefits may include performance bonuses, health insurance, and access to learning and development programs.