Several conditions on the same table - sql

I've inherited an Excel file that was converted to a database, and I'm trying to figure out which people went to several locations.
| Customer | email | ZIP | shop |
| John Smith | js#mail.com | 75016 | 1 |
| Mary King | mary#ymail.com | 97430 | 2 |
| John Smith | js#mail.com | 75016 | 3 |
| Ivan Turtle | ivan#mail.com | 56266 | 5 |
| Mary King | mary#ymail.com | 97430 | 5 |
| John Smith | js#mail.com | 75016 | 5 |
E.g. John Smith has been to shops 1, 3, 5
Mary King to 2, 5
I tried to use email as a key but can't figure out how to solve this one.

Assuming each row specifies a separate location, the simplest answer is:
select customer, count(*) as countOfLocation
from TABLE
group by customer
having count(*) > 1
order by countOfLocation desc

I believe something like this will work for you.
Find all customers who have been to more than one shop, and then join back to get the details of each customer.
This will give back multiple rows per person, one for each shop they have been to. If all you care about is identifying the customers, you can simply run the inner SELECT on its own.
If you want one row per customer and a CSV of the shops, you will have to do a bit of joining.
SELECT a.*
FROM Customers a
INNER JOIN
-- Find all customers who have been to more than one shop.
(SELECT email
FROM Customers
GROUP BY email
HAVING COUNT(shop) > 1) b
ON a.email = b.email
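A minimal sketch of the join-back approach above, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a server. The table name visits and the helper multi_shop_visits are assumptions made for illustration, and COUNT(DISTINCT shop) is used instead of COUNT(shop) so that repeat visits to the same shop would not count as a second location.

```python
import sqlite3

# Sample rows copied from the question; the table name "visits" is hypothetical.
ROWS = [
    ("John Smith", "js#mail.com", "75016", 1),
    ("Mary King", "mary#ymail.com", "97430", 2),
    ("John Smith", "js#mail.com", "75016", 3),
    ("Ivan Turtle", "ivan#mail.com", "56266", 5),
    ("Mary King", "mary#ymail.com", "97430", 5),
    ("John Smith", "js#mail.com", "75016", 5),
]


def multi_shop_visits(rows):
    """Return (customer, shop) rows for customers seen in more than one shop."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits (customer TEXT, email TEXT, zip TEXT, shop INTEGER)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT a.customer, a.shop
        FROM visits AS a
        INNER JOIN (
            -- emails that appear with more than one distinct shop
            SELECT email
            FROM visits
            GROUP BY email
            HAVING COUNT(DISTINCT shop) > 1
        ) AS b ON a.email = b.email
        ORDER BY a.customer, a.shop
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Calling multi_shop_visits(ROWS) leaves out Ivan Turtle, who only appears in one shop.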

Try this if you are on SQL Server:
select customer, count(distinct shop) as shop_count
from table
group by customer
having count(distinct shop) > 1
Or, if you only want to know how many places each customer visited:
select customer, count(distinct shop) as shop_count
from table
group by customer

If you want to show each customer name with their shops separated by commas in one row, use this (replace YourTable with your table name):
SELECT Customer
,STUFF((SELECT ', ' + CAST(shop AS VARCHAR(10)) [text()]
FROM YourTable
WHERE customer = t.customer
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,'') List_Output
FROM YourTable t
GROUP BY customer
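For comparison, here is a rough sketch of the same "one row per customer with a list of shops" idea on SQLite, which has no STUFF or FOR XML PATH. It swaps in SQLite's GROUP_CONCAT aggregate instead; the table name visits and the helper shops_per_customer are hypothetical.

```python
import sqlite3


def shops_per_customer(rows):
    """Map each customer to the sorted list of distinct shops they visited."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (customer TEXT, shop INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    query = """
        SELECT customer, GROUP_CONCAT(DISTINCT shop) AS shops
        FROM visits
        GROUP BY customer
    """
    result = {}
    for customer, shops in conn.execute(query):
        # GROUP_CONCAT does not promise an element order, so sort in Python.
        result[customer] = sorted(int(s) for s in shops.split(","))
    conn.close()
    return result
```

Sorting on the Python side is a deliberate choice here: the concatenation order inside GROUP_CONCAT should not be relied on.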

Related

In a query (no editing of tables) how do I join data without any similarities?

I have a query that returns a table; here's an example of its output.
Name |Age |Hair |Happy | Sad |
Jon | 15 | Black |NULL | NULL|
Kyle | 18 |Blonde |YES |NULL |
Brad | 17 | Blue |NULL |YES |
Name and age come from one table in a database, hair color comes from a second table which is joined, and happy and sad come from a third table. My goal is to make the first row of the result look like this:
Name |Age |Hair |Happy |Sad |
Jon | 15 |Black |Yes |Yes |
Basically I want to get rid of the rows under the first and get the non NULL data joined to the right. The problem is that there is no column where the Yes values are on the Jon row, so I have no idea how to get them there. Any suggestions?
PS. With the data I am using I can't just put a 'YES' in the 'Jon' row and call it a day, I would need to find the specific value from the lower rows and somehow get that value in the boxes that are NULL.
Do you just want COALESCE()?
COALESCE(Happy, 'Yes') as happy
COALESCE() replaces a NULL value with another value.
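A small sketch of what COALESCE() does, run on an in-memory SQLite database. The table name mood and the helper happy_with_default are made up for this illustration.

```python
import sqlite3


def happy_with_default(rows, default="Yes"):
    """Return (name, happy) pairs with NULL happy values replaced by default."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mood (name TEXT, happy TEXT)")
    conn.executemany("INSERT INTO mood VALUES (?, ?)", rows)
    result = conn.execute(
        # COALESCE returns its first non-NULL argument
        "SELECT name, COALESCE(happy, ?) AS happy FROM mood ORDER BY name",
        (default,),
    ).fetchall()
    conn.close()
    return result
```

Rows that already have a value keep it; only the NULLs are filled in.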
If you want to join on a NULL value, work with nested selects. The inner select produces an id for the NULLs, and the outer select joins on it:
select COALESCE(x.Happy, yn_table.description) as happy, ...
from
(select
t1.Happy,
CASE WHEN t1.Happy is null THEN 1 END as happy_id
from t1 ...) x
left join yn_table
on x.happy_id = yn_table.id
If you apply an ORDER BY to the query, you can then select the first row relative to this order with WHERE rownum = 1. If you don't apply an ORDER BY, then the order is random.
After reading your new comment...
the sense is that in my real data the yes under the other names will be a number of a piece of equipment. I want the numbers of the equipment in one row instead of having like 8 rows with only 4 ' yes' values and the rest null.
... I come to the conclusion that this is an XY problem.
You are asking about a detail you think will solve your problem, instead of explaining the problem and asking how to solve it.
If you want to store several pieces of equipment per person, you need three tables.
You need a Person table, an Article table and a junction table relating articles to persons to equip them. Let's call this table Equipment.
Person
------
PersonId (Primary Key)
Name
optional attributes like age, hair color
Article
-------
ArticleId (Primary Key)
Description
optional attributes like weight, color etc.
Equipment
---------
PersonId (Primary Key, Foreign Key to table Person)
ArticleId (Primary Key, Foreign Key to table Article)
Quantity (optional, if each person can have only one of each article, we don't need this)
Let's say we have
Person:
PersonId | Name
1 | Jon
2 | Kyle
3 | Brad
Article:
ArticleId | Description
1 | Hat
2 | Bottle
3 | Bag
4 | Camera
5 | Shoes
Equipment:
PersonId | ArticleId | Quantity
1 | 1 | 1
1 | 4 | 1
1 | 5 | 1
2 | 3 | 2
2 | 4 | 1
Now Jon has a hat, a camera and shoes. Kyle has 2 bags and one camera. Brad has nothing.
You can query the persons and their equipment like this
SELECT
p.PersonId, p.Name, a.ArticleId, a.Description AS Equipment, e.Quantity
FROM
Person p
LEFT JOIN Equipment e
ON p.PersonId = e.PersonId
LEFT JOIN Article a
ON e.ArticleId = a.ArticleId
ORDER BY p.Name, a.Description
The result will be
PersonId | Name | ArticleId | Equipment | Quantity
---------+------+-----------+-----------+---------
3 | Brad | NULL | NULL | NULL
1 | Jon | 4 | Camera | 1
1 | Jon | 1 | Hat | 1
1 | Jon | 5 | Shoes | 1
2 | Kyle | 3 | Bag | 2
2 | Kyle | 4 | Camera | 1
See example: http://sqlfiddle.com/#!4/7e05d/2/0
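The three-table design above can be tried locally with a short sketch like the following, which builds the tables in an in-memory SQLite database and runs the same LEFT JOIN query. The column types and the helper equipment_report are assumptions; the data is the sample data from this answer.

```python
import sqlite3

# Schema and sample data from the answer; column types are assumed.
SCHEMA = """
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Article (ArticleId INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Equipment (
    PersonId INTEGER REFERENCES Person(PersonId),
    ArticleId INTEGER REFERENCES Article(ArticleId),
    Quantity INTEGER DEFAULT 1,
    PRIMARY KEY (PersonId, ArticleId)  -- junction table: composite key
);
INSERT INTO Person VALUES (1, 'Jon'), (2, 'Kyle'), (3, 'Brad');
INSERT INTO Article VALUES
    (1, 'Hat'), (2, 'Bottle'), (3, 'Bag'), (4, 'Camera'), (5, 'Shoes');
INSERT INTO Equipment VALUES
    (1, 1, 1), (1, 4, 1), (1, 5, 1), (2, 3, 2), (2, 4, 1);
"""

QUERY = """
SELECT p.PersonId, p.Name, a.ArticleId, a.Description AS Equipment, e.Quantity
FROM Person p
LEFT JOIN Equipment e ON p.PersonId = e.PersonId
LEFT JOIN Article a ON e.ArticleId = a.ArticleId
ORDER BY p.Name, a.Description
"""


def equipment_report():
    """Return one row per person and article; people with no equipment get NULLs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

The LEFT JOINs are what keep Brad in the report even though he has no Equipment rows.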
Since you tagged the question with the oracle tag, you could just use NVL(), which allows you to specify a value that would replace a NULL value in the column you select from.
Assuming that you want the 1st row because it contains the smallest age:
- wrap your query inside a CTE
- in another CTE get the 1st row of the query
- in another CTE get the max values of Happy and Sad of your query (for your sample data they both are 'YES')
- cross join the last 2 CTEs.
with
cte as (
<your query here>
),
firstrow as (
select name, age, hair from cte
order by age
fetch first row only
),
maxs as (
select max(happy) happy, max(sad) sad
from cte
)
select f.*, m.*
from firstrow f cross join maxs m
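As a rough check of the CTE approach, the sketch below runs it on SQLite with the sample rows from the question. The table report stands in for "your query here" and is purely hypothetical, and LIMIT 1 replaces FETCH FIRST ROW ONLY because SQLite does not support the latter.

```python
import sqlite3

# Sample rows from the question: (name, age, hair, happy, sad).
SAMPLE = [
    ("Jon", 15, "Black", None, None),
    ("Kyle", 18, "Blonde", "YES", None),
    ("Brad", 17, "Blue", None, "YES"),
]


def collapse_to_first_row(rows):
    """Combine the youngest person's row with the max Happy/Sad over all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE report (name TEXT, age INTEGER, hair TEXT, happy TEXT, sad TEXT)"
    )
    conn.executemany("INSERT INTO report VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT * FROM report            -- stands in for the original query
        ),
        firstrow AS (
            SELECT name, age, hair FROM cte
            ORDER BY age
            LIMIT 1                         -- SQLite spelling of FETCH FIRST ROW ONLY
        ),
        maxs AS (
            SELECT MAX(happy) AS happy, MAX(sad) AS sad FROM cte  -- MAX skips NULLs
        )
        SELECT f.name, f.age, f.hair, m.happy, m.sad
        FROM firstrow f CROSS JOIN maxs m
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Because both CTEs produce exactly one row, the cross join also produces exactly one row.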
You can try this:
SELECT A.Name,
A.Age,
B.Hair,
C.Happy,
C.Sad
FROM A
INNER JOIN B
ON A.Name = B.Name
INNER JOIN C
ON A.Name = C.Name
(Assuming that Name is the key column in all 3 tables)

SQL Server query - don't want multiple rows with identical data

I have a SQL Server database that has the following three tables - this is simplified for this post.
Stakeholder table (a table that stores a person's personal data: name, address, city, state, zip, etc.)
Stakeholder_id full_name
---------------------------------------
1 Joe Stakeholder
2 Eric Stakeholder
SH Inquiry table (a table that stores information about when a stakeholder contacts us)
sh_inquiry_id inquiry_link_ID
-----------------------------------------------
1 1
2 1
3 2
Sh Contacts (a table that stores information about when we contact a stakeholder)
sh_contact_id contact_link_id
-----------------------------------------
1 1
2 1
3 2
I want to write a SQL query that shows the stakeholder information once and then shows all inquiries and all contacts underneath the stakeholder row. Is that possible with SQL? In this case Joe Stakeholder would be shown once, followed by 4 rows (2 inquiries and 2 contacts). Eric Stakeholder would be shown once, followed by two rows (1 inquiry and 1 contact).
Thanks for any assistance in advance.
As has already been mentioned, you probably want to handle this in your application code. However, you can use a UNION query to sort of do what you want.
With the query below, I changed your latter 2 tables to SH_Inquiry and SH_Contacts (replaced spaces with underscores), which is generally a good habit (it's a bad idea to have spaces in your object names). Also, depending on how your tables are laid out, you might want to merge your Contacts and Inquiry tables (e.g. have one table, with a contact_type field that identifies it as "inbound" or "outbound").
Anyways, using a CTE and union:
WITH Unionized AS
(
SELECT
stakeholder_id,
full_name,
NULL AS contact_or_inquiry,
NULL AS contact_or_inquiry_id
FROM Stakeholder
UNION ALL
SELECT
inquiry_link_id AS stakeholder_id,
NULL AS full_name,
'inquiry' AS contact_or_inquiry,
sh_inquiry_id AS contact_or_inquiry_id
FROM SH_Inquiry
UNION ALL
SELECT
contact_link_id AS stakeholder_id,
NULL AS full_name,
'contact' AS contact_or_inquiry,
sh_contact_id AS contact_or_inquiry_id
FROM SH_Contacts
)
SELECT
full_name,
contact_or_inquiry,
contact_or_inquiry_id
FROM Unionized
ORDER BY
stakeholder_id,
contact_or_inquiry,
contact_or_inquiry_id
giving you these results:
+------------------+--------------------+-----------------------+
| full_name | contact_or_inquiry | contact_or_inquiry_id |
+------------------+--------------------+-----------------------+
| Joe Stakeholder | NULL | NULL |
| NULL | contact | 1 |
| NULL | contact | 2 |
| NULL | inquiry | 1 |
| NULL | inquiry | 2 |
| Eric Stakeholder | NULL | NULL |
| NULL | contact | 3 |
| NULL | inquiry | 3 |
+------------------+--------------------+-----------------------+
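To experiment with the UNION ALL layout, the following sketch loads the sample rows from the question into an in-memory SQLite database and runs the same query. The lowercase table names and the helper stakeholder_listing are assumptions. It relies on SQLite sorting NULLs first in ascending order, which is what puts each stakeholder's header row ahead of that stakeholder's detail rows.

```python
import sqlite3

# Sample data from the question; lowercase table names are assumed.
SETUP = """
CREATE TABLE stakeholder (stakeholder_id INTEGER, full_name TEXT);
CREATE TABLE sh_inquiry (sh_inquiry_id INTEGER, inquiry_link_id INTEGER);
CREATE TABLE sh_contacts (sh_contact_id INTEGER, contact_link_id INTEGER);
INSERT INTO stakeholder VALUES (1, 'Joe Stakeholder'), (2, 'Eric Stakeholder');
INSERT INTO sh_inquiry VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO sh_contacts VALUES (1, 1), (2, 1), (3, 2);
"""

QUERY = """
WITH unionized AS (
    SELECT stakeholder_id, full_name,
           NULL AS contact_or_inquiry, NULL AS contact_or_inquiry_id
    FROM stakeholder
    UNION ALL
    SELECT inquiry_link_id, NULL, 'inquiry', sh_inquiry_id FROM sh_inquiry
    UNION ALL
    SELECT contact_link_id, NULL, 'contact', sh_contact_id FROM sh_contacts
)
SELECT full_name, contact_or_inquiry, contact_or_inquiry_id
FROM unionized
ORDER BY stakeholder_id, contact_or_inquiry, contact_or_inquiry_id
"""


def stakeholder_listing():
    """Return a header row per stakeholder followed by their detail rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

In a UNION ALL the column names come from the first SELECT, so the later branches only need to line up by position and type.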

More efficient way to query shortest string value associated with each value in another column in Hive QL

I have a table in Hive containing store names, order IDs, and User IDs (as well as some other columns including item ID). There is a row in the table for every item purchased (so there can be more than one row per order if the order contains multiple items). Order IDs are unique within a store, but not across stores. A single order can have more than one user ID associated with it.
I'm trying to write a query that will return a list of all stores and order IDs and the shortest user ID associated with each order.
So, for example, if the data looks like this:
STORE | ORDERID | USERID | ITEMID
------+---------+--------+-------
| a | 1 | bill | abc |
| a | 1 | susan | def |
| a | 2 | jane | abc |
| b | 1 | scott | ghi |
| b | 1 | tony | jkl |
Then the output would look like this:
STORE | ORDERID | USERID
------+---------+-------
a | 1 | bill
a | 2 | jane
b | 1 | tony
I've written a query that will do this, but I feel like there must be a more efficient way to go about it. Does anybody know a better way to produce these results?
This is what I have so far:
select
users.store, users.orderid, users.userid
from
(select
store, orderid, userid, length(userid) as len
from
sales) users
join
(select distinct
store, orderid,
min(length(userid)) over (partition by store, orderid) as len
from
sales) len on users.store = len.store
and users.orderid = len.orderid
and users.len = len.len
This will probably work for you. It achieves your goal with a single SELECT and a window function, without joining the table to itself:
select distinct
store, orderid,
first_value(userid) over(partition by store, orderid order by length(userid) asc) f_val
from
sales;
The result will be:
store orderid f_val
a 1 bill
a 2 jane
b 1 tony
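Probably rank() is the best way:
select s.*
from (select s.*, rank() over (partition by store, orderid order by length(userid)) as seqnum
from sales s
) s
where seqnum = 1;
Note that rank() returns every row tied for the shortest length; use row_number() if you need exactly one row per order.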
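The window-function idea can be tried outside Hive with a sketch like this one, which uses SQLite (version 3.25 or later, for window function support) through Python. It swaps in row_number() with userid as a tie-breaker so that exactly one row per order comes back; the tie-breaker and the helper shortest_user_per_order are assumptions, not part of the question.

```python
import sqlite3

# Sample rows from the question: (store, orderid, userid, itemid).
SALES = [
    ("a", 1, "bill", "abc"),
    ("a", 1, "susan", "def"),
    ("a", 2, "jane", "abc"),
    ("b", 1, "scott", "ghi"),
    ("b", 1, "tony", "jkl"),
]


def shortest_user_per_order(rows):
    """Return (store, orderid, userid) with the shortest userid for each order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (store TEXT, orderid INTEGER, userid TEXT, itemid TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT store, orderid, userid
        FROM (
            SELECT store, orderid, userid,
                   ROW_NUMBER() OVER (
                       PARTITION BY store, orderid
                       ORDER BY LENGTH(userid), userid  -- userid breaks ties
                   ) AS seqnum
            FROM sales
        ) AS ranked
        WHERE seqnum = 1
        ORDER BY store, orderid
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Partitioning by both store and orderid matters because order IDs are only unique within a store.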

Rows which do not exist in a table

I have a list of names (John, Rupert, Cassandra, Amy), and I want to get the names which do not exist in the table: Cassandra, Amy.
How should I write such a query?
My table:
+----+--------+-----------+------+
| id | name | address | tele |
+----+--------+-----------+------+
| 1 | Rupert | Somewhere | 022 |
| 2 | John | Doe | 029 |
| 3 | Donald | Armstrong | 021 |
| 4 | Bob | Gates | 022 |
+----+--------+-----------+------+
Think in sets. You add names to the result set with UNION ALL, and you remove names from the result set with EXCEPT.
select 'John'
union all
select 'Rupert'
union all
select 'Cassandra'
union all
select 'Amy'
except
select name from mytable;
Build up a list of your names to check and do a left join to the users table:
with to_check (name) as (
values
('John'), ('Rupert'), ('Cassandra'), ('Amy')
)
select tc.name as missing_name
from to_check tc
left join the_table tt on tt.name = tc.name
where tt.name is null;
SQLFiddle example: http://sqlfiddle.com/#!15/5c4f5/1
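A small sketch of the same "list minus table" idea, run on in-memory SQLite from Python. It uses NOT EXISTS rather than the LEFT JOIN ... IS NULL form (the two are equivalent here), and the helper missing_names and the temporary table to_check are illustrative names only.

```python
import sqlite3


def missing_names(candidates, existing):
    """Return the candidate names that have no matching row in the_table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (name TEXT)")
    conn.executemany("INSERT INTO the_table VALUES (?)", [(n,) for n in existing])
    # Load the list to check into a temporary table so it can be queried as a set.
    conn.execute("CREATE TEMP TABLE to_check (name TEXT)")
    conn.executemany("INSERT INTO to_check VALUES (?)", [(n,) for n in candidates])
    query = """
        SELECT tc.name
        FROM to_check AS tc
        WHERE NOT EXISTS (
            SELECT 1 FROM the_table AS tt WHERE tt.name = tc.name
        )
        ORDER BY tc.name
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result
```

One reason to prefer NOT EXISTS (or the left join) over NOT IN: if the subquery of a NOT IN returns any NULL, the predicate is never true and no rows come back.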
Assuming your list is stored in a table, call it b, and your original table is a,
the SQL query looks like this:
Select name from b where name not in (select name from a);
I think this gives you the solution, as far as I understand the question. If further details are required, please comment.
Also, it is worth searching for an existing answer first, since this looks like a question from a book or class. You would have found much more information, for example at the link below:
How to write "not in ()" sql query using join

Merging rows SQL - Access

I have this table on MS Access:
Name | Week | Manager | Sales
John | 201409 | Marcelo | 53
John | 201410 | Marcelo | 20
John | 201410 | Raquel | 30
John | 201411 | Raquel | 53
I have to merge the rows for Week 201410 into one, keeping the Manager with the highest Sales. I'd then like to sum the Sales of the merged rows, so that the result looks like this:
Name | Week | Manager | Sales
John | 201409 | Marcelo | 53
John | 201410 | Raquel | 50
John | 201411 | Raquel | 53
Could anybody help me? I tried a lot of SQL and couldn't get anything useful.
You can try this:
SELECT [Name], [Week], [Manager], SUM([Sales]) as Sales1
From [YourTable]
GROUP BY [Name], [Week], [Manager]
I did not test this so let me know what errors you get.
If each row had a unique identifier (Primary Key), it would be a lot simpler. However, you work with the data you have, not with the data you wish you had, so here's my circuitous way of accomplishing it. You could combine this all into one query and avoid using temporary tables; I split it out this way to make it convenient to understand, rather than being concise.
First, extract the highest Sales for each Name-Week combination:
SELECT Name, Week, MAX(Sales) AS Sales
INTO #MaxSales
FROM [YourTable]
GROUP BY Name, Week
Use this information to get the Manager that you should use for each week. (This does not resolve the case where two managers have the same sales for the same Name/Week, which would need a tie-breaker such as TOP 1; I'm not sure how you would want to resolve this.):
SELECT [YourTable].Name, [YourTable].Week, [YourTable].Manager
INTO #MaxSalesManager
FROM [YourTable]
INNER JOIN #MaxSales
ON [YourTable].Name = #MaxSales.Name
AND [YourTable].Week = #MaxSales.Week
WHERE [YourTable].Sales = #MaxSales.Sales
Now you can extract the information you need:
SELECT [YourTable].Name, [YourTable].Week, #MaxSalesManager.Manager, SUM([YourTable].Sales)
FROM [YourTable]
INNER JOIN #MaxSalesManager
ON [YourTable].Name = #MaxSalesManager.Name
AND [YourTable].Week = #MaxSalesManager.Week
GROUP BY [YourTable].Name, [YourTable].Week, #MaxSalesManager.Manager
Hope this helps!
EDIT:
Combining them all into one query:
SELECT [YourTable].Name,
[YourTable].Week,
MaxSalesManager.Manager,
SUM([YourTable].Sales) AS Sales
FROM [YourTable]
INNER JOIN
(SELECT [YourTable].Name, [YourTable].Week, [YourTable].Manager
FROM [YourTable]
INNER JOIN
(SELECT Name, Week, MAX(Sales) AS Sales
FROM [YourTable]
GROUP BY Name, Week) AS MaxSales
ON [YourTable].Name = MaxSales.Name
AND [YourTable].Week = MaxSales.Week
WHERE [YourTable].Sales = MaxSales.Sales) AS MaxSalesManager
ON [YourTable].Name = MaxSalesManager.Name
AND [YourTable].Week = MaxSalesManager.Week
GROUP BY [YourTable].Name, [YourTable].Week, MaxSalesManager.Manager
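The combined query can be sanity-checked against the question's sample data with a sketch like this, run on in-memory SQLite rather than Access. The table name weekly_sales, the helper merged_sales, and the MIN(manager) tie-breaker for managers with equal sales are all assumptions added for the illustration.

```python
import sqlite3

# Sample rows from the question: (name, week, manager, sales).
DATA = [
    ("John", 201409, "Marcelo", 53),
    ("John", 201410, "Marcelo", 20),
    ("John", 201410, "Raquel", 30),
    ("John", 201411, "Raquel", 53),
]


def merged_sales(rows):
    """One row per name/week: the top-selling manager and the summed sales."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE weekly_sales (name TEXT, week INTEGER, manager TEXT, sales INTEGER)"
    )
    conn.executemany("INSERT INTO weekly_sales VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT t.name, t.week, m.manager, SUM(t.sales) AS sales
        FROM weekly_sales AS t
        INNER JOIN (
            -- manager holding the highest sales per name/week;
            -- MIN(manager) is an assumed tie-breaker for equal sales
            SELECT s.name, s.week, MIN(s.manager) AS manager
            FROM weekly_sales AS s
            INNER JOIN (
                SELECT name, week, MAX(sales) AS max_sales
                FROM weekly_sales
                GROUP BY name, week
            ) AS mx
              ON s.name = mx.name AND s.week = mx.week AND s.sales = mx.max_sales
            GROUP BY s.name, s.week
        ) AS m ON t.name = m.name AND t.week = m.week
        GROUP BY t.name, t.week, m.manager
        ORDER BY t.name, t.week
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Without some tie-breaker, two managers with equal top sales in the same week would each get their own output row carrying the full weekly total.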