SQL Query for Pivoting data in MS-Access - sql

I have a table with three fields: ticketNumber, attendee and tableNumber.
e.g.
ticketNumber | attendee | tableNumber
------------ | -------- | -----------
A1 | alex | 3
A2 | bret | 2
A3 | chip | 1
A4 | dale | 2
A5 | eric | 2
A6 | finn | 3
I'd like to generate a table with each tableNumber as a field and the list of names sitting at that tableNumber.
1 | 2 | 3
-----|------|-----
chip | bret | alex
| dale | finn
| eric |
The query I've been working on is:
transform attendee
select attendee
from registration
group by tableNumber, name
pivot tableNumber
This gives me:
attendee | 1 | 2 | 3
---------|------|------|-----
chip | chip | |
bret | | bret |
dale | | dale |
eric | | eric |
alex | | | alex
finn | | | finn
I know how to get the table I require using PivotTableView but I'd like to know how to do it with a query so that I can use it in a code I'm working on. I'm unsure of how I can write this query without having to select a field (in my case, select attendee). Also is it possible to generate the table without the empty cells?
Thank you :)

The table needs a unique identifier field that will properly sort and I don't think the ticketNumber can be relied on for that. An autonumber should serve. Then try:
TRANSFORM First(Table1.attendee) AS FirstOfattendee
SELECT DCount("*","Table1","tableNumber=" & [tableNumber] & " AND ID<" & [ID])+1 AS RowSeq
FROM Table1
GROUP BY DCount("*","Table1","tableNumber=" & [tableNumber] & " AND ID<" & [ID])+1
PIVOT Table1.tableNumber;

Related

Cumulative SUM in a query (SQL access)

Using MS access SQL I have a query (actually a UNION made of multiple queries) and need a cumulative sum (actually a statement of account which items are in chronological order).
How do I get a cumulative sum?
Since they are duplicates by date I have to add a new ID, however, SQL in MS access does not seem to have ROW_ID or similar.
So, we need to sort donation data into chronological order across multiple tables with duplicates. First combine all the tables of donators in one query which sets up the simplest syntax. Then to put things in order we need to have an order for the duplicate dates. The dataset has two natural ways to sort duplicate dates including the donator and the amount. For instance, we could decide that after the date bigger donations come first, If the rule is complicated enough we abstract it to a code module and into public function and include it in the query so that we can sort by it:
'Sorted Donations:'
SELECT (BestDonator(q.donator)) as BestDonator, *
FROM tblCountries as q
UNION SELECT (BestDonator(j.donator)) as BestDonator, *
FROM tblIndividuals as j
ORDER BY EvDate Asc, Amount DESC , BestDonator DESC;
Public Function BestDonator(donator As String) As Long
BestDonator = Len(donator) 'longer names are better :)'
End Function
with sorted donations we have settled on an order for the duplicate dates and have combined both individual donations and country donations, so now we can calculate the running sum directly using either dsum or a subquery. There is no need to calculate row id. The tricky part is getting the syntax correct. I ended up abstracting the running sum calculation to a function and omitting BestDonator because I couldn't easily paste together this query in the query designer and I ran out of time to bug fix
Public Function RunningSum(EvDate As Date, Amount As Currency)
RunningSum = DSum("Amount", "Sorted Donations", "(EvDate < #" & [EvDate] & "#) OR (EvDate = #" & [EvDate] & "# AND Amount >= " & [Amount] & ")")
End Function
Carefully note the OR in the Dsum part of the RunningSum calculation. This is the tricky part to summing the right amounts.
'output
-------------------------------------------------------------------------------------
| donator | EvDate | Amount | RunningSum |
-------------------------------------------------------------------------------------
| Reiny | 1/10/2020 | 321 | 321 |
-------------------------------------------------------------------------------------
| Czechia | 3/1/2020 | 7455 | 7776 |
-------------------------------------------------------------------------------------
| Germany | 3/18/2020 | 4222 | 11998 |
-------------------------------------------------------------------------------------
| Jim | 3/18/2020 | 222 | 12220 |
-------------------------------------------------------------------------------------
| Australien | 4/15/2020 | 13423 | 25643 |
-------------------------------------------------------------------------------------
| Mike | 5/31/2020 | 345 | 25988 |
-------------------------------------------------------------------------------------
| Portugal | 6/6/2020 | 8755 | 34743 |
-------------------------------------------------------------------------------------
| Slovakia | 8/31/2020 | 3455 | 38198 |
-------------------------------------------------------------------------------------
| Steve | 9/6/2020 | 875 | 39073 |
-------------------------------------------------------------------------------------
| Japan | 10/10/2020 | 5234 | 44307 |
-------------------------------------------------------------------------------------
| John | 10/11/2020 | 465 | 44772 |
-------------------------------------------------------------------------------------
| Slowenia | 11/11/2020 | 4665 | 49437 |
-------------------------------------------------------------------------------------
| Spain | 11/22/2020 | 7677 | 57114 |
-------------------------------------------------------------------------------------
| Austria | 11/22/2020 | 3221 | 60335 |
-------------------------------------------------------------------------------------
| Bill | 11/22/2020 | 767 | 61102 |
-------------------------------------------------------------------------------------
| Bert | 12/1/2020 | 755 | 61857 |
-------------------------------------------------------------------------------------
| Hungaria | 12/24/2020 | 9996 | 71853 |
-------------------------------------------------------------------------------------

Replacing multiple strings from a databsae column with distinct replacements

I have a hive table as below:
+----+---------------+-------------+
| id | name | partnership |
+----+---------------+-------------+
| 1 | sachin sourav | first |
| 2 | sachin sehwag | first |
| 3 | sourav sehwag | first |
| 4 | sachin_sourav | first |
+----+---------------+-------------+
In this table I need to replace strings such as "sachin" with "ST" and "Sourav" with "SG". I am using following query, but it is not solving the purpose.
Query:
select
*,
case
when name regexp('\\bsachin\\b')
then regexp_replace(name,'sachin','ST')
when name regexp('\\bsourav\\b')
then regexp_replace(name,'sourav','SG')
else name
end as newName
from sample1;
Result:
+----+---------------+-------------+---------------+
| id | name | partnership | newname |
+----+---------------+-------------+---------------+
| 4 | sachin_sourav | first | sachin_sourav |
| 3 | sourav sehwag | first | SG sehwag |
| 2 | sachin sehwag | first | ST sehwag |
| 1 | sachin sourav | first | ST sourav |
+----+---------------+-------------+---------------+
Problem: My intention is, when id = 1, the newName column should bring value as "ST SG". I mean it should replace both strings.
You can nest the replaces:
select s.*,
replace(replace(s.name, 'sachin', 'ST'), 'sourav', 'SG') as newName
from sample1 s;
You don't need regular expressions, so just use replace().

SQL query for many-to-many self-join

I have a database table that has a companion many-to-many self-join table alongside it. The primary table is part and the other table is alternate_part (basically, alternate parts are identical to their main part with different #s). Every record in the alternate_part table is also in the part table. To illustrate:
`part`
| part_id | part_number | description |
|---------|-------------|-------------|
| 1 | 00001 | wheel |
| 2 | 00002 | tire |
| 3 | 00003 | window |
| 4 | 00004 | seat |
| 5 | 00005 | wheel |
| 6 | 00006 | tire |
| 7 | 00007 | window |
| 8 | 00008 | seat |
| 9 | 00009 | wheel |
| 10 | 00010 | tire |
| 11 | 00011 | window |
| 12 | 00012 | seat |
`alternate_part`
| main_part_id | alt_part_id |
|--------------|-------------|
| 1 | 5 | // Wheel
| 5 | 1 | // |
| 5 | 9 | // |
| 9 | 5 | // |
| 2 | 6 | // Tire
| 6 | 2 | // |
| ... | ... | // |
I am trying to produce a simple SQL query that will give me a list of all alternates for a main part. The tricky part is: some alternates are only listed as alternates of alternates, it is not guaranteed that every viable alternate for a part is listed as a direct alternate. e.g., if 'Part 3' is an alternate of 'Part 2' which is an alternate of 'Part 1', then Part 3 is an alternate of Part 1 (even if the alternate_part table doesn't list a direct link). The reverse is also true (Part 1 is an alternate of Part 3).
Basically, right now I'm pulling alternates and iterating through them
SELECT p.*, ap.*
FROM part p
INNER JOIN alternate_part ap ON p.part_id = ap.main_part_id
And then going back and doing the same again on those alternates. But, I think there's got to be a better way.
The SQL query I'm looking for will basically give me:
| part_id | alt_part_id |
|---------|-------------|
| 1 | 5 |
| 1 | 9 |
For part_id = 1, even when 1 & 9 are not explicitly linked in the alternates table.
Note: I have no control whatever over the structure of the DB, it is a distributed software solution.
Note 2: It is an Oracle platform, if that affects syntax.
You have to create hierarchical tree , probably you have to use connect by prior , nocycle query
something like this
select distinct p.part_id,p.part_number,p.description,c.main_part_id
from part p
left join (
select main_part_id,connect_by_root(main_part_id) real_part_id
from alternate_part
connect by NOCYCLE prior main_part_id = alternate_part_id
) c
on p.part_id = c.real_part_id and p.part_id != c.main_part_id
order by p.part_id
You can read full documentation about Hierarchical queries at http://docs.oracle.com/cd/B28359_01/server.111/b28286/queries003.htm

SQL for calculated column that chooses from value in own row

I have a table in which several indentifiers of a person may be stored. In this table I would like to create a single calculated identifier column that stores the best identifier for that record depending on what identifiers are available.
For example (some fictional sample data) ....
Table = "Citizens"
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | ?
2 | Jones | JG892435262 | NULL | NULL | ?
3 | Trask | TSK73948379 | NULL | 9276542119 | ?
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | ?
I know the order in which I would like choose identifiers ...
Drivers-License-No
Social-Security-No
State-Id-No
So I would like the calculated identifier column to be part of the table schema. The desired results would be ...
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | 374-784-8888
2 | Jones | JG892435262 | NULL | 4537409273 | JG892435262
3 | Trask | NULL | NULL | 9276542119 | 9276542119
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | CL231429888
IS this possible? If so what SQL would I use to calculate what goes in the "Calculated" column?
I was thinking of something like ..
SELECT
CASE
WHEN ([DL-No] is NOT NULL) THEN [DL-No]
WHEN ([SS-No] is NOT NULL) THEN [SS-No]
WHEN ([State-Id-No] is NOT NULL) THEN [State-Id-No]
AS "Calculated"
END
FROM Citizens
The easiest solution is to use coalesce():
select c.*,
coalesce([DL-No], [SS-No], [State-ID-No]) as calculated
from citizens c
However, I think your case statement will also work, if you fix the syntax to use when rather than where.

How to tokenize a SQL Server column for use in frequency distribution in SSAS

I have a field in my table called Description . Here is an examples of a few records:
+-----------+------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+---------+---------+-------------------+---------------+-------------+-------------+
| RecordKey | RecordType | Price | Description | RecordNumber | DiscsinSet | Country | Company | DigitalAnalogCode | Genre | UPC | datecreated |
+-----------+------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+---------+---------+-------------------+---------------+-------------+-------------+
| 100488 | CD | 5.99 | Korngold, Honegger, Verdi, Wagner, Puccini, Leoncavallo, Giordano: Opera Arias + 'I Know Where I'm Going'. (Ellen Faull, soprano. Taken from the Sylvan Levin Opera Concert Broadcasts of 1951 & 1952. Total time: 65'47') | VAIA 1173 | 1 | AMERICA | VAI | M | Songs & Arias | 89948117322 | 42:38.4 |
| 100503 | CD | 11.98 | Puccini, Madama Butterfly. (Kirsten, Barioni, Nadell et al. New Orleans Opera/ Cellini. Rec.3/60) | VAIA 1054-2 | 2 | AMERICA | VAI | A | Opera | 89948105428 | 42:38.4 |
| 100516 | MV | 8.99 | Brahms, 8 Gypsy Songs. Schumann, 2 Short Gypsy Songs. Liszt, The 3 Gypsies. Verdi, The Gypsy Woman. J.Strauss, 'Gypsy Baron'- Song of Sapphi + Other Gypsy Songs by Balakirev, Varlamov, Tchaikovsky, Verstovskij, Dvorak & Lehar. (Ljuba Kazarnovskaya, soprano w.Mark Morash, piano. Rec.Moscow, 2/19/98) | 69503 | 1 | AMERICA | VAI | S | NULL | 89948695035 | 42:38.4 |
+-----------+------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+---------+---------+-------------------+---------------+-------------+-------------+
Apologies, if it is difficult to read, but the description field has a lot of text.
I would like to create a frequency distribution of every word in this field.
The output I would like would look something like this:
+-----------+-------+
| word | count |
+-----------+-------+
| Beethoven | 344 |
| Strauss | 34533 |
| Piano | 3 |
| Webber | 34 |
+-----------+-------+
If it makes more sense, could you point me in the right direction on how this can be achieved with SSAS?
If you have a separate list of valid words, you can just do:
select w.word, count(*)
from mytable t join
words w
on ', ' + w.word + ', ' like '%, ' + t.description + ', %'
group by w.word;
If you don't, then look around the web for a split() function. You can then use cross apply for something like:
select w.value, count(*)
from mytable t cross apply
(select *
from split(t.description, ', ')
) w
group by w.value;
If you have control over the data structure, then naughty, naughty. SQL has this wonderful data structure for storing lists. It is called a table. It is not called a string. You should be using a junction table -- if you have control. One doesn't always have control over such issues, though.