Pandasmohali
1. What is Pandas?
Pandas is an open-source data analysis and manipulation library in Python. It provides data
structures like Series (1D) and DataFrame (2D) to handle structured data efficiently.
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
● DataFrame: A two-dimensional table with rows and columns, where each column can
have a different data type.
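As a small illustration of the two structures, the sketch below builds a DataFrame from made-up data and pulls out one column as a Series. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data: each column holds a different kind of value.
people = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Height': [1.65, 1.80],
})

# Selecting a single column returns a one-dimensional Series.
ages = people['Age']

print(type(people).__name__)  # DataFrame
print(type(ages).__name__)    # Series
print(people.dtypes)          # one dtype per column
print(people.shape)           # (number of rows, number of columns)
```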
You can load a CSV file into a DataFrame with read_csv():
df = pd.read_csv('file.csv')
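Since 'file.csv' is only a placeholder, the sketch below reads the same kind of data from an in-memory string so that it runs without a file on disk. The column names and values are hypothetical.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; read_csv accepts any file-like object.
csv_text = "name,score\nAlice,90\nBob,85\n"

scores = pd.read_csv(io.StringIO(csv_text))

print(scores.head())   # first rows, to check the data was parsed as expected
print(scores.dtypes)   # column types inferred by read_csv
```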
🟡 Intermediate Level
6. How do you select a column from a DataFrame?
df['column_name']
To filter rows, index the DataFrame with a boolean condition:
filtered_df = df[df['column_name'] > 10]
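Conditions can also be combined. A minimal sketch, assuming a hypothetical DataFrame with city and sales columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'city': ['A', 'B', 'A', 'C'], 'sales': [5, 12, 20, 8]})

# Combine conditions with & (and) or | (or); each condition needs parentheses.
big_a = df[(df['sales'] > 10) & (df['city'] == 'A')]

# .loc filters rows and selects a column in one step.
cities_over_10 = df.loc[df['sales'] > 10, 'city']

print(big_a)
print(cities_over_10)
```

The parentheses matter because & and | bind more tightly than comparison operators in Python.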
10. What is the difference between apply() and map()?
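A minimal sketch of the distinction, using a hypothetical DataFrame: Series.map() transforms one Series element by element and also accepts a dict as a lookup table, while DataFrame.apply() passes each whole column (or each row, with axis=1) to the function.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'grade': ['a', 'b', 'a'], 'x': [1, 2, 3], 'y': [10, 20, 30]})

# map(): element-wise on a single Series; a dict acts as a lookup table.
df['points'] = df['grade'].map({'a': 4, 'b': 3})

# apply() on a DataFrame: the function receives one whole column at a time.
col_ranges = df[['x', 'y']].apply(lambda col: col.max() - col.min())

# apply() with axis=1: the function receives one row at a time.
row_sums = df[['x', 'y']].apply(lambda row: row['x'] + row['y'], axis=1)

print(df)
print(col_ranges)
print(row_sums)
```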
You can sort a DataFrame by a column with sort_values():
df.sort_values(by='column_name', ascending=False)
You can add a new column by assigning values to a new column name. A list must contain
exactly one value per row:
df['new_column'] = [value1, value2, value3]  # placeholder values; requires len(df) == 3
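Besides a list, the assigned value can be a scalar or an expression built from existing columns. A minimal sketch, assuming a hypothetical DataFrame with price and qty columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'price': [10.0, 20.0, 30.0], 'qty': [1, 2, 3]})

# A list must have exactly one value per row.
df['label'] = ['low', 'mid', 'high']

# A scalar is broadcast to every row.
df['currency'] = 'USD'

# A column can be computed from existing columns.
df['total'] = df['price'] * df['qty']

print(df)
```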
🔵 Advanced Level
13. What are multi-indexes in Pandas, and why are they used?
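A MultiIndex is a hierarchical index with more than one level, which lets a DataFrame be keyed by several columns at once. The sketch below is a minimal illustration with hypothetical region and year data, not a full treatment.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    'region': ['East', 'East', 'West', 'West'],
    'year': [2023, 2024, 2023, 2024],
    'sales': [100, 120, 80, 95],
})

# Passing two columns to set_index produces a two-level MultiIndex.
indexed = df.set_index(['region', 'year'])

# Select every row under one outer-level key.
east = indexed.loc['East']

# Select a single value by giving a key for every level.
west_2024 = indexed.loc[('West', 2024), 'sales']

print(indexed)
print(east)
print(west_2024)
```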
The groupby() function is used to group data based on a column and then apply aggregation
or transformation functions to the grouped data.
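The two uses described above, aggregation and transformation, can be sketched as follows. The team and points columns are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    'team': ['A', 'A', 'B', 'B'],
    'points': [10, 20, 5, 15],
})

# Aggregation: one result per group.
totals = df.groupby('team')['points'].sum()

# Several aggregations at once.
summary = df.groupby('team')['points'].agg(['mean', 'max'])

# Transformation: the result has the same length as the original column.
df['team_mean'] = df.groupby('team')['points'].transform('mean')

print(totals)
print(summary)
print(df)
```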
15. How do you merge/join DataFrames in Pandas?
merged_df = pd.merge(df1, df2, on='common_column', how='inner')
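The how argument controls which keys survive the merge. A minimal sketch comparing three join types, with hypothetical df1 and df2 contents:

```python
import pandas as pd

# Hypothetical data: keys 2 and 3 appear in both frames.
df1 = pd.DataFrame({'common_column': [1, 2, 3], 'left_val': ['a', 'b', 'c']})
df2 = pd.DataFrame({'common_column': [2, 3, 4], 'right_val': ['x', 'y', 'z']})

inner = pd.merge(df1, df2, on='common_column', how='inner')  # keys present in both
left = pd.merge(df1, df2, on='common_column', how='left')    # every key from df1
outer = pd.merge(df1, df2, on='common_column', how='outer')  # keys from either frame

print(inner)
print(left)   # right_val is missing (NaN) where df2 has no matching key
print(outer)
```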
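● append(): Was used to add rows to a DataFrame, but it was less efficient than concat().
DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so use
pd.concat() instead.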
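A minimal sketch of adding rows with pd.concat(), assuming two hypothetical DataFrames with the same columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df1 = pd.DataFrame({'id': [1, 2], 'value': [10, 20]})
df2 = pd.DataFrame({'id': [3], 'value': [30]})

# Stack the rows; ignore_index=True renumbers the result 0..n-1.
combined = pd.concat([df1, df2], ignore_index=True)

print(combined)
```

When rows arrive in a loop, collecting the pieces in a list and calling pd.concat() once at the end avoids copying the growing DataFrame on every iteration.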
You can reshape a DataFrame from long to wide format with pivot():
df_pivot = df.pivot(index='col1', columns='col2', values='col3')
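A minimal sketch of what the call produces, using hypothetical values for col1, col2, and col3. It also shows pivot_table(), which is the usual choice when an (index, columns) pair occurs more than once, since pivot() raises a ValueError in that case.

```python
import pandas as pd

# Hypothetical long-format data: one row per (col1, col2) pair.
df = pd.DataFrame({
    'col1': ['r1', 'r1', 'r2', 'r2'],
    'col2': ['x', 'y', 'x', 'y'],
    'col3': [1, 2, 3, 4],
})

# Wide format: one row per col1 value, one column per col2 value.
df_pivot = df.pivot(index='col1', columns='col2', values='col3')

# Add a duplicate of the first row, then aggregate duplicates with pivot_table.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
table = dup.pivot_table(index='col1', columns='col2', values='col3', aggfunc='sum')

print(df_pivot)
print(table)
```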
You can use pd.to_datetime() to convert a column to datetime type, and use time-based
indexing and resampling:
df['date'] = pd.to_datetime(df['date'])  # parse the column into datetime values
df.set_index('date', inplace=True)       # resample() works on a datetime index
daily = df.resample('D').sum()           # one row per calendar day
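The same steps on a small hypothetical dataset, including an example of time-based indexing with a date string:

```python
import pandas as pd

# Hypothetical data: two events on 1 January, none on 2 January, one on 3 January.
df = pd.DataFrame({
    'date': ['2024-01-01 08:00', '2024-01-01 17:00', '2024-01-03 09:00'],
    'amount': [5, 7, 2],
})

df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')

# Daily totals; days with no rows appear with a sum of 0.
daily = df.resample('D').sum()

# Time-based indexing: a date string selects every row on that day.
jan_first = df.loc['2024-01-01']

print(daily)
print(jan_first)
```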
The query() function allows you to filter data using a string expression:
df.query('column_name > 10')
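Inside the expression, a Python variable can be referenced with an @ prefix, and conditions can be joined with and / or. A minimal sketch with hypothetical data; the kind column and threshold variable are made up for the example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'column_name': [5, 15, 25], 'kind': ['a', 'b', 'a']})

threshold = 10  # local variable, referenced in the query as @threshold

result = df.query('column_name > @threshold and kind == "a"')

print(result)
```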
You can compute a moving average with rolling():
df['moving_avg'] = df['column_name'].rolling(window=3).mean()
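With window=3, the first two rows have no complete window, so their result is NaN. The sketch below shows this on hypothetical data, along with min_periods, which allows partial windows.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'column_name': [1, 2, 3, 4, 5]})

# Full windows only: rows 0 and 1 are NaN.
df['moving_avg'] = df['column_name'].rolling(window=3).mean()

# min_periods=1 computes the mean over however many values are available so far.
df['moving_avg_partial'] = df['column_name'].rolling(window=3, min_periods=1).mean()

print(df)
```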
You can remove duplicate rows with drop_duplicates():
df.drop_duplicates(inplace=True)
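By default a row counts as a duplicate only when every column matches an earlier row. The sketch below, on hypothetical data, also shows subset and keep, which restrict the comparison to chosen columns and pick which occurrence survives.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are identical; rows 2 and 3 share only the id.
df = pd.DataFrame({'id': [1, 1, 2, 2], 'value': ['a', 'a', 'b', 'c']})

# Count fully duplicated rows before removing them.
n_dupes = df.duplicated().sum()

# Drop rows that repeat an earlier row in every column.
deduped = df.drop_duplicates()

# Compare on 'id' only and keep the last occurrence of each id.
by_id_last = df.drop_duplicates(subset='id', keep='last')

print(n_dupes)
print(deduped)
print(by_id_last)
```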