Transcription of: Pandas Unique Values | Python Pandas Tutorial #11 | Pandas Unique and Nunique Functions




Hey, and welcome back to my channel. In this video, you'll learn all about working with unique values in pandas, including how to identify unique values, count them, and work with unique values across different columns. Let's get started.

If you're new to my channel, consider subscribing to be notified when I release new videos just like this one. If you haven't seen the rest of this series, check out the link above to learn more about pandas and how to use it for data analysis in Python.

All right, let's write some code. I've already written down some code here which imports pandas and assigns it the alias pd. Here we're creating a data frame using a dictionary, where we'll have three different columns called a, b, and c, which each contain different values. Let's run this code and then print the data frame to see what it looks like. So, like I mentioned before, we do have three columns, a, b, and c, each containing different values ranging from one through five.

Now, finding unique values in pandas is extremely easy, as there's a .unique method built directly into it. The unique method is applied directly to a particular column, meaning that we can get the unique values within a pandas column by accessing the column and then applying the unique method to it. So, for example, if we run this, we can see that it returns an array object with the numbers 1 and 2 in it, because 1 and 2 are the only distinct values in column a. Now, you could also write this using dot notation instead of bracket notation, by writing df.a.unique(), and it would return the exact same thing.

Now, in itself, having this array returned isn't necessarily the most useful form for conducting further analysis. Say you wanted to work with it as a list: you'd be able to turn this array directly into a list by using the tolist method. So again, we've written df and accessed the a column, we've then created the array using the unique method, and then we apply another method to the
array directly, which is the tolist method. When we run this, we can see that we've now generated a list that contains the same values as the array, but in list format.

You can accomplish the exact same thing as we did using the tolist method by using the list function. So instead of chaining the tolist method onto the unique method, we're wrapping the entire thing in the list function. So we've written list and then passed in the unique array generated by applying the unique method to the a column of data frame df, and you can see here that we've returned the exact same list as before.

Now, say you wanted to know how many unique values are in each column. You could do this by passing the unique array to the len function. So, for example, when we run this, we can see that column a has two unique values, as we know from here. Now, we can change df['a'] to df['b'], for example, if we wanted to know how many unique values there are in column b. If we do this, we can see that there are three unique values. Let's quickly scroll up to make sure that this is accurate. So we have the values 3 and 3, 4, and 5, so there are three unique values in this column.

Now, there may be times when you want to see unique combinations of values across multiple columns. In order to do this, we can run the drop_duplicates method applied to two specific columns. You'll note here that we've actually wrapped the column names in double square brackets to select only columns a and b. When we run this, what's happened is it's returned a data frame with the unique combinations of values across the two columns. We can see here that record 2 has been dropped. Let's take a look at what record 2 looked like previously. Record 2 here contains the same values as index number 0 in columns a and b. Because of this, it's been dropped. So what this data frame here is telling us is that the unique combinations across columns a and b are the subset found here.

Another thing you may be curious about is how often each unique value actually occurs within
a given column. For this, we can use the value_counts method, which we've covered in a different tutorial that you can find right up here. So let's run this. We've applied the value_counts method directly to column a of data frame df. We can see here that the value 1 occurs two times, and the value 2 occurs twice as well. Now let's take a look at the unique values and their counts for column b. We can see that the value 3 occurs two times, while the values 5 and 4 occur once. This is really handy for getting a better sense of how often different values occur within a data frame.

The last method we'll take a look at is the nunique method. This one's a bit unique in that it returns the count of unique values across an entire data frame. So when we run df.nunique(), what this will return is, per column, how many unique values exist in each column. So we can see here that column a contains two unique values, column b contains three, while column c contains four. Let's scroll back up to make sure that this is accurate. We've already explored that column a only contains two unique values, column b contains three, while column c contains four.

Okay, so you've learned quite a bit in this video. You've learned how to work with unique values in pandas using the unique method as well as the nunique method. You've learned how to count unique values within a data frame, as well as how to identify unique values across multiple columns. If you have any questions, be sure to leave them in the comments below and I'll be happy to answer them. If you enjoyed this video, click the like button and consider subscribing to be notified when I release new videos just like this one. Thanks so much for watching, and have a great day.
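
The on-screen code is not captured in this transcript, so the following is only a rough sketch of the steps narrated in the video. The data frame values are assumptions, chosen so that they agree with what is described (column a has two unique values, column b three, column c four, and row 2 repeats row 0 in columns a and b); the variable names are also just illustrative.

```python
import pandas as pd

# Assumed data: the actual values used in the video are not in the transcript.
df = pd.DataFrame({
    "a": [1, 2, 1, 2],
    "b": [3, 4, 3, 5],
    "c": [1, 2, 3, 4],
})

# Unique values of one column, returned as a NumPy array
# in order of first appearance.
unique_a = df["a"].unique()
unique_a_dot = df.a.unique()  # same result using dot notation

# Convert the array to a plain Python list.
unique_a_list = df["a"].unique().tolist()  # [1, 2]
# list() also works, although its elements stay NumPy integers.
unique_a_list_alt = list(df["a"].unique())

# Count the unique values in a single column.
n_unique_a = len(df["a"].unique())  # 2
n_unique_b = len(df["b"].unique())  # 3

# Unique combinations across columns a and b.
# Row 2 repeats row 0 in these columns, so it is dropped.
unique_pairs = df[["a", "b"]].drop_duplicates()

# How often each unique value occurs in a column.
counts_a = df["a"].value_counts()
counts_b = df["b"].value_counts()

# Number of unique values per column, for the whole data frame.
unique_per_column = df.nunique()

print(unique_a_list)
print(unique_pairs)
print(counts_b)
print(unique_per_column)
```

With this assumed data, drop_duplicates keeps the rows at index 0, 1, and 3, and df.nunique() reports 2, 3, and 4 for columns a, b, and c, which lines up with the walkthrough above.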