• Skip to main content
  • Skip to primary sidebar

Datannum

Data science for small business applications

  • Home
  • Microsoft Excel (spreadsheet) forum
    • Project: Analyze NYSE Data
  • SQL forum
    • Project: Query a Digital Music Store Database
  • Tableau forum (data visualization)
    • Project: Data Visualization Project using Tableau
  • Linear Algebra forum
    • Curated resource to learn linear algebra
  • Web development
    • WordPress forum
    • Google Support

Count of a column with Group By and Order By

Data Science for Small Business Applications › Forums › SQL › Count of a column with Group By and Order By

Tagged: COUNT, GROUP BY, Order By in SQL

  • This topic is empty.
Viewing 1 post (of 1 total)
  • Author
    Posts
  • April 24, 2022 at 6:23 am #146
    Admin
    Keymaster

      Here is the problem:

      All of the questions in this quiz refer to the open source Chinook Database. Please familiarize yourself with the ER diagram in order to familiarize yourself with the table and column names in order to write accurate queries and get the appropriate answers.

      Find the cities with the most customers and rank in descending order.

      Source: https://www.coursera.org/learn/sql-for-data-science%5B/caption%5D

      The only progress I could make is with this code:

      SELECT *
      FROM customers
      ORDER BY city

      I could see for instance Berlin appearing twice. Now, these are the tasks that I believe will take me closer to the solution:

      Count the occurrence of each city in a new column. For instance, Amsterdam with a count of 1. Berlin with count 2…. Next, sort that column in descending order.

      Given this problem set after a lecture on CASE statements, there should be the usage of CASE.


      Reply


      You don’t need to use a CASE statement, just GROUP BY, COUNT and ORDER BY

      SELECT COUNT(city) AS countcity, city
      FROM customers
      GROUP BY city
      ORDER BY countcity desc

      Are there any NULLs in the city column?

      SELECT CASE
      WHEN COUNT(*) > COUNT(COMPANY)
      THEN 'COMPANY HAS NULLS'
      ELSE NULL END,
      CASE
      WHEN COUNT(*) > COUNT(PostalCode)
      THEN 'PostalCode has nulls'
      ELSE NULL END,
      CASE
      WHEN COUNT(*) > COUNT(city)
      THEN "CITY HAS NULLS"
      ELSE "NO DOES NOT HAVE NULL" END
      FROM CUSTOMERS

      Company, postal code have nulls. City does not have null.

      Now change the problem just slightly

      instead of finding the cities with the most customers and rank in descending order

      do this —

      Find the companies with the fewest customers and rank in ascending order.

      it’s important to learn the nuances of NULLs

      SELECT COUNT(company) AS countcompany, company, customerid
      FROM customers
      GROUP BY company
      ORDER BY company

      ORDER BY company

      wasn’t what asked, but it will do.
      Look at the first row of results.
      Why 0 companies with NULL, when you know there is at least one NULL company, namely for customer 59.
      By the way, having customerid in your SELECT clause but not in your GROUP BY clause causes an error. Here is the solution.

      SELECT company  -- this is what we're grouping by
           , COUNT(*)       AS number_of_rows_including_nulls
           , COUNT(company) AS number_of_non_null_values
        FROM customers
      GROUP
          BY company
      ORDER 
          BY number_of_rows_including_nulls ASC

      Make sure you understand why customerid is not included in the GROUP BY and also not included in the SELECT.

      But most important, make sure you understand what COUNT(*) is showing you, and what COUNT(company) is showing you.

      If necessary, go back to where you we were using several CASE expressions to figure out which columns have nulls.

    • Author
      Posts
    Viewing 1 post (of 1 total)
    • You must be logged in to reply to this topic.
    Log In

    Primary Sidebar

    profile for Techcosec Limited on Stack Exchange, a network of free, community-driven Q&A sites

    RSS CS50’s Introduction to Computer Science by Harvard University on Edx

    • How to print winner January 21, 2023
    • find_min function and infinite loop when a candidate not winner in first instance January 17, 2023
    • Bool variable won and print_winner function January 12, 2023
    • True condition with while loop January 6, 2023
    • How to finally display result in tabular form of choices entered by voters December 12, 2022

    Subscribe via Email

    Enter your email address to subscribe and receive notifications of new posts by email.

    This website hosted on WP Engine

    Initiative by Digital Splendid OPC Pvt. Ltd. & Techcosec Limited