Python
Python is one of the most popular programming languages among data scientists and
AI/ML professionals. This popularity is due to the following key features of Python:
Python 3.7 has 35 reserved keywords, and the set can change from one version to the
next. A list of all the keywords is provided below:
Keywords in Python
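The keywords of whichever interpreter is running can be printed with the standard-library keyword module. The snippet below is a minimal sketch; the count it prints depends on the Python version in use.

```python
import keyword
import sys

# keyword.kwlist holds the reserved words of the interpreter running this code,
# so its length depends on the Python version.
keywords = keyword.kwlist
print(sys.version_info[:2], len(keywords))
print(keywords)

# keyword.iskeyword() checks a single name.
print(keyword.iskeyword("lambda"))   # True
print(keyword.iskeyword("print"))    # False: print is a built-in function, not a keyword
```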
tup1 = (1,"a",True)
tup2 = (4,5,6)
Concatenation of tuples means that we are adding the elements of one tuple at the
end of another tuple.
Code
tup1=(1,"a",True)
tup2=(4,5,6)
tup1+tup2
Output
(1, 'a', True, 4, 5, 6)
Use the '+' operator between the two tuples to get the concatenated result. The
order of the operands decides the order of the elements in the result.
Code
tup1=(1,"a",True)
tup2=(4,5,6)
tup2+tup1
Output
(4, 5, 6, 1, 'a', True)
8. How can you initialize a 5*5 numpy array with only zeroes?
Solution ->
import numpy as np
n1=np.zeros((5,5))
n1
Use np.zeros() and pass the shape as a tuple. Since we want a 5*5 matrix, we pass
(5,5) to np.zeros(). The resulting array holds floating-point zeros by default.
9. What is Pandas?
Pandas is an open-source Python library with a rich set of data structures for
data-based operations. Its features suit many kinds of data work, from academic
analysis to complex business problems. Pandas can read and write a large variety
of file formats and is one of the most important tools to have a grip on.
10. What are dataframes?
A pandas dataframe is a mutable, two-dimensional data structure. It supports
heterogeneous data arranged along two axes (rows and columns).
import pandas as pd
df=pd.read_csv("mydata.csv")
Here df is a pandas dataframe. read_csv() reads a comma-delimited file into a
dataframe.
Code
import pandas as pd
data=["1",2,"three",4.0]
series=pd.Series(data)
print(series)
print(type(series))
Output
Code
df = pd.DataFrame({'Vehicle':['Etios','Lamborghini','Apache200','Pulsar200'],
'Type':["car","car","motorcycle","motorcycle"]})
df
Output
df.groupby('Type').count()
Output
df=pd.DataFrame()
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
df["cars"]=cars
df["bikes"]=bikes
df
Output
Code
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
df
Output
Concat works best when the dataframes have the same columns. It is used to combine
data with similar fields and essentially stacks the dataframes vertically into a
single dataframe.
Join is used when we need to combine data from different dataframes that have one
or more common columns. The stacking is horizontal in this case.
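The difference between the two can be sketched as follows. The frames and column names are made up for illustration, and merge() is used for the horizontal case because it matches rows on a named column, whereas DataFrame.join() aligns on the index by default.

```python
import pandas as pd

# Hypothetical frames with the same columns: concat stacks them vertically.
q1 = pd.DataFrame({"id": [1, 2], "sales": [100, 150]})
q2 = pd.DataFrame({"id": [3, 4], "sales": [120, 90]})
stacked = pd.concat([q1, q2], ignore_index=True)
print(stacked)   # 4 rows, columns id and sales

# Hypothetical frame sharing the "id" column: merge lines rows up side by side.
names = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
joined = stacked.merge(names, on="id", how="inner")
print(joined)    # 3 rows, columns id, sales and name
```

With how="inner", only ids present in both frames are kept, so the row with id 4 is dropped from the result.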
18. Given the below dataframe, drop all rows having NaN.
df.dropna(inplace=True)
df
Output
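As a self-contained illustration of dropna(), the sketch below uses a small made-up dataframe (not the one from the question) with two missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": ["A", "B", None]})

# dropna() removes every row that contains at least one missing value.
# Without inplace=True it returns a new frame and leaves df unchanged.
clean = df.dropna()
print(clean)      # only the first row (1.0, "A") remains
print(len(df))    # 3
```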
19. How to access the first five entries of a dataframe?
The head() function returns the first rows of a dataframe. By default df.head()
returns the first five rows. To get the first n rows, use df.head(n).
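A short sketch of head() on a made-up ten-row dataframe, with tail() shown as its counterpart for the last rows:

```python
import pandas as pd

# Hypothetical frame with ten rows.
df = pd.DataFrame({"n": range(10)})

print(df.head())     # first 5 rows (the default)
print(df.head(3))    # first 3 rows
print(df.tail(2))    # last 2 rows
```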
21. How to fetch a data entry from a pandas dataframe using a given value in index?
To fetch a row from a dataframe given its index label x, we can use df.loc[x].
Code
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
a=[10,20,30,40,50]
df.index=a
df.loc[10]
Output
22. What are comments and how can you add comments in Python?
Comments in Python are pieces of text meant for the reader; the interpreter ignores
them. They are especially relevant when more than one person works on the same
code. They can be used to explain code, leave feedback, and help with debugging.
There are two types of comments:
Single-line comment
Multiple-line comment
Code needed for adding comments
Example
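Both kinds of comment can be sketched as below. Python has no dedicated multi-line comment syntax, so the usual approach is consecutive lines that each start with a hash sign; the variable name total is only there for illustration.

```python
# Single-line comment: everything after the hash sign is ignored.
total = 2 + 3  # a comment can also follow code on the same line

# A multiple-line comment is written as
# several consecutive lines that each
# start with their own hash sign.

"""
A triple-quoted string that is not assigned to anything is sometimes used
as a block comment. It is really a string expression that gets evaluated
and discarded, not a true comment.
"""

print(total)  # 5
```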
d={"a":1,"b":2}
25. Find out the mean, median and standard deviation of this numpy array ->
np.array([1,5,3,100,4,48])
import numpy as np
n1=np.array([1,5,3,100,4,48])
print(np.mean(n1))
print(np.median(n1))
print(np.std(n1))
26. What is a classifier?
A classifier is used to predict the class of any data point. Classifiers are
special hypotheses that are used to assign class labels to any particular data
points. A classifier often uses training data to understand the relation between
input variables and the class. Classification is a method used in supervised
learning in Machine Learning.
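To make the idea concrete, here is a toy 1-nearest-neighbour classifier in plain Python. It is only one simple kind of classifier, chosen for brevity, and the training points and labels are invented for illustration.

```python
# Toy 1-nearest-neighbour classifier: a new point gets the label of the
# closest training point. All data below is made up for illustration.
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

def classify(point):
    """Return the label of the training example nearest to point."""
    def squared_distance(example):
        features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, point))
    _features, label = min(training, key=squared_distance)
    return label

print(classify((2.0, 1.0)))   # small
print(classify((7.5, 8.0)))   # large
```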
def add(n):
    return n + n
number = (15, 25, 35, 45)
res = map(add, number)
print(list(res))
o/p: [30, 50, 70, 90]
ex: var=copy.copy(obj)
ex: a = 100
type(a)
o/p: <class 'int'>
Frozen set: Frozen sets are like sets but immutable, which means elements cannot be
added, removed, or changed once the frozen set is created.
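A brief sketch of this behaviour, using the built-in frozenset type; the variable names are illustrative:

```python
# frozenset takes any iterable, just like set.
vowels = frozenset("aeiou")

print("a" in vowels)      # True: membership tests work as for a set
print(vowels | {"y"})     # union returns a new frozenset; vowels is unchanged

# Mutating methods such as add() do not exist on a frozenset.
try:
    vowels.add("y")
except AttributeError as err:
    print("cannot modify:", err)

# Because it is immutable and hashable, a frozenset can be a dictionary key.
labels = {frozenset({1, 2}): "pair"}
print(labels[frozenset({2, 1})])   # pair
```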
def square(n):
    '''Takes in a number n, returns the square of n'''
    return n**2
print(square.__doc__)
Output: Takes in a number n, returns the square of n
53. How to Reverse a String in Python?
Python strings have no built-in reverse() method. Instead, we can use a slicing
operation with a step of -1:
str_reverse = string[::-1]
Code
import pandas as pd
a=[1,2,3]
b=[2,3,5]
d={"col1":a,"col2":b}
df=pd.DataFrame(d)
df["Sum"]=df["col1"]+df["col2"]
df["Difference"]=df["col1"]-df["col2"]
df
Output
pandas
2. What are the different functions that can be used by groupby in pandas?
groupby() in pandas can be used with multiple aggregate functions, some of which
are sum(), mean(), count(), and std().
Data is divided into groups based on categories and then the data in these
individual groups can be aggregated by the aforementioned functions.
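The aggregate functions named above can be tried on a small made-up dataframe; the column names Type and Price are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: two categories with two values each.
df = pd.DataFrame({
    "Type": ["car", "car", "motorcycle", "motorcycle"],
    "Price": [10, 30, 2, 4],
})

grouped = df.groupby("Type")["Price"]
print(grouped.sum())     # car 40, motorcycle 6
print(grouped.mean())    # car 20.0, motorcycle 3.0
print(grouped.count())   # car 2, motorcycle 2
print(grouped.std())     # sample standard deviation per group
```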
3. How to select columns in pandas and add them to a new dataframe? What if there
are two columns with the same name?
If df is a dataframe in pandas, df.columns gives the list of all columns. We can
then form a new dataframe by selecting the columns we need.
If there are two columns with the same name, then both columns get copied to the
new dataframe.
Code
print(d_new.columns)
d=d_new[["col1"]]
d
Output
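The duplicate-name case can be reproduced with a small frame. The data here is invented; only the names d_new and col1 come from the snippet above.

```python
import pandas as pd

# Hypothetical frame in which the name "col1" appears twice.
d_new = pd.DataFrame([[1, 2, "x"], [3, 4, "y"]], columns=["col1", "col1", "col2"])
print(list(d_new.columns))    # ['col1', 'col1', 'col2']

# Selecting by name returns every column carrying that name.
d = d_new[["col1"]]
print(d.shape)                # (2, 2)

# Selecting a unique name builds a new frame with just that column.
subset = d_new[["col2"]].copy()
print(list(subset.columns))   # ['col2']
```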
4. How to delete a column or group of columns in pandas? Given the below dataframe
drop column “col1”.
d={"col1":[1,2,3],"col2":["A","B","C"]}
df=pd.DataFrame(d)
df=df.drop(["col1"],axis=1)
df
Output
5. Given the following dataframe, drop the rows whose column value is "A".
Code
d={"col1":[1,2,3],"col2":["A","B","C"]}
df=pd.DataFrame(d)
df=df[df["col2"]!="A"]
df
Output
6. Given the below dataset find the highest paid player in each college in each
team.
df.groupby(["Team","College"])["Salary"].max()
7. Given the above dataset, find the min, max and average salary of a player
college-wise and team-wise.
Code
df.groupby(["Team","College"])["Salary"].agg([('max','max'),('min','min'),
('count','count'),('avg','mean')])
Output
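A runnable sketch of the same aggregation follows. The rows are made up; only the column names Team, College and Salary come from the question, and the output column names are illustrative. It uses keyword-style named aggregation as one possible way to label the result columns.

```python
import pandas as pd

# Made-up rows; only the column names Team, College and Salary come from the text.
df = pd.DataFrame({
    "Team": ["A", "A", "A", "B"],
    "College": ["X", "X", "Y", "X"],
    "Salary": [100, 300, 200, 50],
})

# Named aggregation: each keyword becomes an output column.
summary = df.groupby(["Team", "College"])["Salary"].agg(
    highest="max", lowest="min", players="count", average="mean"
)
print(summary)
print(summary.loc[("A", "X"), "average"])   # 200.0
```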
Code
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
a=[10,20,30,40,50]
df.index=a
df
Output