SQL Duplicated names in result - sql

I've got a problem with SQL.
Here is my code:
SELECT Miss.Name, Miss.Surname, Master.Name, Master.Surname,
COUNT(Date.Id_date) AS [Dates_together]
FROM Miss, Master, Date
WHERE Date.Id_miss = Miss.Id_miss AND Date.Id_master = Master.Id_master
GROUP BY Miss.Name, Miss.Surname, Master.Name, Master.Surname
ORDER BY [Dates_together] DESC
and I've got the result:
Dorothy | Mills | James | Jackson | 28
Dorothy | Mills | Kayne | West | 28
Emily | Walters | James | Jackson | 13
Emily | Walters | Tom | Marvel | 12
Sunny | Sunday | Kayne | West | 9
and I really do not know what to change to have a result like this:
Dorothy | Mills | James | Jackson | 28
Emily | Walters | Tom | Marvel | 12
Sunny | Sunday | Kayne | West | 9
Because I don't want to have duplicated names of master or miss in the result... :(
Can anyone help me?

It looks like your result set is correct, as you are getting the appropriate distinct combinations.

The "duplicates" are accurate, because you are querying the combinations of the Miss and Master records, not the Miss and Master records themselves. For instance, in your second result set, it doesn't capture the fact that Dorothy Mills dated Kayne West 28 times.

You don't mention which database you're working with, but if I have this correctly you're trying to determine how many times a given couple have been on a date?
I think you need to ask yourself what happens if you have two people, of either sex, that share the same combination of christian name and surname...
Start off with:
SELECT Id_master, Id_miss, COUNT(*) AS datecount FROM [Date] GROUP BY Id_master, Id_miss
From there, you simply need to add their names to the results...
Should get you started on the right track...
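As a rough sketch of that idea, the following groups by the two id columns first and only then joins the names on. It runs against an in-memory SQLite database from Python purely for illustration; the sample rows are made up, and the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Miss   (Id_miss   INTEGER PRIMARY KEY, Name TEXT, Surname TEXT);
CREATE TABLE Master (Id_master INTEGER PRIMARY KEY, Name TEXT, Surname TEXT);
CREATE TABLE [Date] (Id_date   INTEGER PRIMARY KEY, Id_miss INTEGER, Id_master INTEGER);
-- Made-up sample data, not the asker's real rows
INSERT INTO Miss   VALUES (1, 'Dorothy', 'Mills'), (2, 'Emily', 'Walters');
INSERT INTO Master VALUES (1, 'James', 'Jackson'), (2, 'Tom', 'Marvel');
INSERT INTO [Date] (Id_miss, Id_master)
VALUES (1, 1), (1, 1), (1, 1), (2, 1), (2, 2), (2, 2);
""")

# Count dates per (miss, master) pair using ids only, then attach the names.
# Grouping on ids keeps two people with identical names apart.
query = """
SELECT Miss.Name, Miss.Surname, Master.Name, Master.Surname, c.datecount
FROM (SELECT Id_miss, Id_master, COUNT(*) AS datecount
      FROM [Date]
      GROUP BY Id_miss, Id_master) AS c
JOIN Miss   ON Miss.Id_miss     = c.Id_miss
JOIN Master ON Master.Id_master = c.Id_master
ORDER BY c.datecount DESC, c.Id_miss, c.Id_master
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(" | ".join(str(value) for value in row))
```

Each couple still appears once per combination, which is what the first answer describes: the counts belong to pairs, not to individual people.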

Related

Pivot Data in a BigQuery Standard SQL View Definition

I'm not sure whether this is possible with some of the new BigQuery scripting capabilities, UDFs, array/string functions (or anything else!), however I simply can't figure it out.
I'm trying to write the SQL for a view in BigQuery which dynamically defines columns based on query results, similar to a pivot table in a spreadsheet/BI tool (or melt in pandas). I can do this externally in Python or hard-code it using case statements, but I'm sure that a SQL solution to this would be incredibly useful to a huge number of people.
Essentially I'm trying to write a query which would transform a table like this:
year | name | number
-----------------------
1963 | Michael | 9246
1961 | Michael | 9055
1958 | Michael | 9203
1957 | Michael | 9116
1953 | Robert | 9061
1952 | Robert | 9205
1951 | Robert | 9054
1948 | Robert | 9015
1947 | Robert | 10025
1947 | John | 9634
1946 | Robert | 9295
----------------------
SQL to generate initial example table:
SELECT year, name, number
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE number > 9000
ORDER BY year DESC
Into a table with the following structure:
year | John  | Michael | Robert
---------------------------------
1946 |       |         | 9,295
1947 | 9,634 |         | 10,025
1948 |       |         | 9,015
...
This then needs to be connected to downstream tools, without requiring maintenance when the data changes. I know that this is not always a great idea and that tidy form data is more universally useful, but there are still some scenarios where this behaviour is desirable.
I have seen a few solutions on here, but they all seem to involve string generation and then manually pasting the query... I can do this via the BigQuery API but am desperate to find a dynamic solution using nothing but SQL so I don't have to maintain an external function.
Thanks in advance for any pointers!
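For reference, the external approach mentioned above (generating one CASE expression per name outside of SQL) might look roughly like the sketch below. It is not the pure-SQL view the question asks for; it runs against an in-memory SQLite table rather than BigQuery, and the table name `names` and the helper `build_pivot_sql` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (year INTEGER, name TEXT, number INTEGER)")
conn.executemany("INSERT INTO names VALUES (?, ?, ?)", [
    (1946, "Robert", 9295), (1947, "John", 9634), (1947, "Robert", 10025),
    (1948, "Robert", 9015), (1957, "Michael", 9116),
])

def build_pivot_sql(conn):
    """Hypothetical helper: emit one CASE column per distinct name."""
    names = [row[0] for row in conn.execute(
        "SELECT DISTINCT name FROM names ORDER BY name")]
    columns = []
    for name in names:
        literal = name.replace("'", "''")   # escape for a SQL string literal
        alias = name.replace('"', '""')     # escape for a quoted identifier
        columns.append("MAX(CASE WHEN name = '" + literal +
                       "' THEN number END) AS \"" + alias + "\"")
    return ("SELECT year, " + ", ".join(columns) +
            " FROM names GROUP BY year ORDER BY year")

cursor = conn.execute(build_pivot_sql(conn))
header = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(header)
for row in rows:
    print(row)
```

The generated statement has to be rebuilt whenever a new name appears, which is exactly the maintenance burden the question is trying to avoid.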

How do you conditionally order by multiple fields in postgres

My specific use case is that I want to sort a list of users by name; first name, last name. The user has a preferred name and a legal name. I want to order by the preferred name if it is present, but the legal name as fall back.
For example, given the following table:
id | first_name | last_name | preferred_first_name | preferred_last_name
----+------------+-----------+----------------------+---------------------
9 | Ryan | Bently | Alan |
10 | Ryan | Do | Billy | Baxter
11 | Olga | Clancierz | |
12 | Anurag | Plaxty | | Henderson
13 | Sander | Cliff | Billy |
I want to sort like this:
Alan Bently
Anurag Henderson
Billy Baxter
Billy Cliff
Olga Clancierz
Normally, with just one set of name fields I would just do this:
SELECT * from users ORDER BY users.first_name, users.last_name
What is the best way to order by preferred name fields when present, but fall back to other name fields when they are not present?
Try
ORDER BY COALESCE(users.preferred_first_name, users.first_name), COALESCE(users.preferred_last_name, users.last_name)
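A small sketch of the COALESCE fallback applied to both the first and the last name, run here against SQLite from Python rather than Postgres (COALESCE behaves the same way in both). It assumes the blank cells in the question's table are NULL; if they are empty strings instead, wrapping each preferred column in NULLIF(column, '') would be needed first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE users (
    id INTEGER, first_name TEXT, last_name TEXT,
    preferred_first_name TEXT, preferred_last_name TEXT)
""")
# Rows from the question; blanks are assumed to be NULL
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?)", [
    (9, "Ryan", "Bently", "Alan", None),
    (10, "Ryan", "Do", "Billy", "Baxter"),
    (11, "Olga", "Clancierz", None, None),
    (12, "Anurag", "Plaxty", None, "Henderson"),
    (13, "Sander", "Cliff", "Billy", None),
])

# Each name part falls back to the legal name only when the preferred one is NULL
query = """
SELECT COALESCE(preferred_first_name, first_name) AS shown_first,
       COALESCE(preferred_last_name, last_name)   AS shown_last
FROM users
ORDER BY shown_first, shown_last
"""
sorted_names = [first + " " + last for first, last in conn.execute(query)]
print(sorted_names)
```

Applying the fallback to the last name as well is what puts "Billy Baxter" ahead of "Billy Cliff", matching the ordering requested in the question.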

Increment value when the field is the same

First, I'm sorry for the ambiguous title.
Here's my problem :
I'm using Access and I have this table :
+--------+-----------+
| PARENT | CHILD |
+--------+-----------+
| JOHN | TANIA |
| JOHN | ROBERT |
| JOHN | APRIL |
| HELEN | TOM |
| HELEN | GABRIELLE |
+--------+-----------+
And I would like to add a column like this with queries or VBA code :
+--------+-----------+---------+
| PARENT | CHILD | LIST |
+--------+-----------+---------+
| JOHN | TANIA | CHILD 1 |
| JOHN | ROBERT | CHILD 2 |
| JOHN | APRIL | CHILD 3 |
| HELEN | TOM | CHILD 1 |
| HELEN | GABRIELLE | CHILD 2 |
+--------+-----------+---------+
I want to do this because at the end, I want to run a cross tab query. I'm only missing that last column to create that query.
I tried to do it in a recordset, but my database starts bloating after a couple of rst.Update (I have 700k+ rows)
I created a temporary table and used UPDATE queries but it just takes too much time.
I think there might be a SQL code that would do what I need, but I just can't figure it out. I hope you could help me, thanks :)
You can do something like the below, but it would be much better with some sort of IDs:
SELECT Parent.PARENT,
Parent.CHILD,
(SELECT Count(*)
FROM Parent p
WHERE p.Parent=Parent.Parent
AND p.Child<=Parent.Child) AS ChildNo
FROM Parent
ORDER BY Parent.PARENT, Parent.CHILD;
Parent is the name of the table.
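To see what the correlated subquery produces, here is a minimal sketch run against SQLite from Python instead of Access. The `'CHILD ' ||` prefix is an addition to match the requested LIST column (in Access the concatenation operator would be `&`). Note that the numbering follows the alphabetical order of CHILD within each PARENT, not the order in which the rows were entered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parent (PARENT TEXT, CHILD TEXT)")
conn.executemany("INSERT INTO Parent VALUES (?, ?)", [
    ("JOHN", "TANIA"), ("JOHN", "ROBERT"), ("JOHN", "APRIL"),
    ("HELEN", "TOM"), ("HELEN", "GABRIELLE"),
])

# For each row, count the siblings whose name sorts at or before this child's name
query = """
SELECT t.PARENT, t.CHILD,
       'CHILD ' || (SELECT COUNT(*)
                    FROM Parent AS p
                    WHERE p.PARENT = t.PARENT
                      AND p.CHILD <= t.CHILD) AS LIST
FROM Parent AS t
ORDER BY t.PARENT, t.CHILD
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the subquery runs once per row, this can be slow on a table of 700k+ rows unless PARENT and CHILD are indexed.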

Counting fields in a group by, and generating a report with ms access

So I have this table
City | Status | District | Revenue
------------------------------------------
Oakland | Executed | North | $9.50
Los Angeles| Cancelled| South | $0.05
Oakland | Executed | North | $0.99
Oakland | Cancelled| North | $98.40
Sacramento | Executed | North | $43.50
Sacramento | Cancelled| North | $5.40
Los Angeles| Cancelled| South | $5.30
So I need this report that reads like this:
North District | Executed | Cancelled | Revenue
--------------------------------------------------------
Oakland | 2 | 1 | Sum of revenue
Sacramento | 1 | 1 | Sum of revenue
--------------------------------------------------------
South District | Executed | Cancelled | Revenue
--------------------------------------------------------
Los Angeles | 0 | 2 | Sum of revenue
But I'm stuck on how to create a query that groups and counts instances of specific values inside that group.
I mean I know the syntax of group statements and count statements, but counting the instances of a specific value inside a group seems pretty different from a regular count.
Can anyone guide me in the right direction? I'm not asking anyone to do my work (this isn't even a full sample of what I have to do) but if someone can help me with a statement that groups and counts specific rows in the group, with a SQL statement or an Access function, that would be awesome. From there I'd be able to figure out everything else.
Hey I ran across an answer actually. I just had to use Sum(IIF()) and it worked correctly.
SELECT
Test.City,
Sum(IIf(Status="Executed",1,0)) AS Executed,
Sum(IIf(Status="Cancelled",1,0)) AS Cancelled
FROM Test
GROUP BY Test.City
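The same conditional-sum idea can be tried outside Access. The sketch below uses SQLite from Python, with CASE WHEN standing in for Access's IIf, and also groups by District and sums Revenue to get closer to the report in the question. Storing Revenue as a plain number without the dollar sign is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (City TEXT, Status TEXT, District TEXT, Revenue REAL)")
conn.executemany("INSERT INTO Test VALUES (?, ?, ?, ?)", [
    ("Oakland", "Executed", "North", 9.50),
    ("Los Angeles", "Cancelled", "South", 0.05),
    ("Oakland", "Executed", "North", 0.99),
    ("Oakland", "Cancelled", "North", 98.40),
    ("Sacramento", "Executed", "North", 43.50),
    ("Sacramento", "Cancelled", "North", 5.40),
    ("Los Angeles", "Cancelled", "South", 5.30),
])

# Each SUM adds 1 only for rows with the matching status, so it acts as a filtered count
query = """
SELECT District, City,
       SUM(CASE WHEN Status = 'Executed'  THEN 1 ELSE 0 END) AS Executed,
       SUM(CASE WHEN Status = 'Cancelled' THEN 1 ELSE 0 END) AS Cancelled,
       SUM(Revenue) AS Revenue
FROM Test
GROUP BY District, City
ORDER BY District, City
"""
rows = conn.execute(query).fetchall()
for district, city, executed, cancelled, revenue in rows:
    print(district, city, executed, cancelled, round(revenue, 2))
```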

How to add column with the value of another dimension?

I apologize if the title does not make sense. I am trying to do something that is probably simple, but I have not been able to figure it out, and I'm not sure how to search for the answer. I have the following MDX query:
SELECT
event_count ON 0,
TOPCOUNT(name.children, 10, event_count) ON 1
FROM
events
which returns something like this:
| | event_count |
+---------------+-------------+
| P Davis | 123 |
| J Davis | 123 |
| A Brown | 120 |
| K Thompson | 119 |
| R White | 119 |
| M Wilson | 118 |
| D Harris | 118 |
| R Thompson | 116 |
| Z Williams | 115 |
| X Smith | 114 |
I need to include an additional column (gender). Gender is not a metric. It's just another dimension on the data. For instance, consider this query:
SELECT
gender.children ON 0,
TOPCOUNT(name.children, 10, event_count) ON 1
FROM
events
But this is not what I want! :(
| | female | male | unknown |
+--------------+--------+------+---------+
| P Davis | | | 123 |
| J Davis | | 123 | |
| A Brown | | 120 | |
| K Thompson | | 119 | |
| R White | 119 | | |
| M Wilson | | | 118 |
| D Harris | | | 118 |
| R Thompson | | | 116 |
| Z Williams | | | 115 |
| X Smith | | | 114 |
Nice try, but I just want three columns: name, event_count, and gender. How hard can it be?
Obviously this reflects lack of understanding about MDX on my part. Any pointers to quality introductory material would be appreciated.
It's important to understand that in MDX you are building sets of members on each axis, and not specifying column names like a tabular rowset. You are describing a 2-dimensional grid of results, not a linear rowset. If you imagine each dimension as a table, the member set is the set of unique values from a single column in that table.
When you choose a Measure as the member (as in your first example), it looks as if you're selecting from a table, so it's easy to misunderstand. When you choose a Dimension, you get many members, and a cross-join between the rows and columns (which is sparse in this case because the names and genders are 1-to-1).
So, you could crossjoin these two dimensions on a single axis, and then filter out the null cells:
SELECT
event_count ON 0,
TOPCOUNT(
NonEmptyCrossJoin(name.children, gender.children),
10,
event_count) ON 1
FROM
events
Which should give you results that have a single column (event_count) and 10 rows, where each row is composed of the tuple (name, gender).
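If it helps to picture what the query does, here is a loose Python analogy (not MDX, and not how an OLAP engine is implemented): cross join the two member sets, drop the tuples with no cell value, then keep the top rows by measure. The `facts` dictionary is a hypothetical stand-in for the cube, filled with a few rows from the question.

```python
from itertools import product

# Hypothetical stand-in for the cube: (name, gender) -> event_count
facts = {
    ("P Davis", "unknown"): 123,
    ("J Davis", "male"): 123,
    ("A Brown", "male"): 120,
    ("R White", "female"): 119,
}

names = sorted({name for name, _ in facts})
genders = sorted({gender for _, gender in facts})

# Plain cross join: every (name, gender) tuple, most of which have no data
all_tuples = list(product(names, genders))

# "Non-empty" cross join: keep only tuples that actually have a cell value
non_empty = [pair for pair in all_tuples if pair in facts]

# Rough analogue of TOPCOUNT(set, 3, event_count)
top = sorted(non_empty, key=lambda pair: facts[pair], reverse=True)[:3]
for name, gender in top:
    print(name, gender, facts[(name, gender)])
```

Each surviving row is a (name, gender) tuple with a single event_count value, which is the three-column shape the question asks for.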
I hope that sets you on the right path, and please feel free to ask if you want me to clarify.
For general introductory material, I think the book "MDX Solutions" is a good place to start:
http://www.amazon.ca/MDX-Solutions-Microsoft-Analysis-Services/dp/0471748080/
For online introductory MDX material, you can have a look at this gentle introduction that presents the main MDX concepts.