How to concat all of a select sql query - sql

I have an Access 2000 database and I want to write a SQL query that runs one SELECT and, for each row it returns, uses that row's id to run a nested SELECT and concatenate all of the nested results. The ids are linked by a relationship, so I just need to concatenate the results of the nested second SELECT.
So if the tables look like this...
Table 1 Table 2
|ID | First Name| |ID | Notes|
----------------- ------------
|1 | Mike | |1 | testing|
|2 | Alex | |1 | test2 |
|3 | Jon | |2 | testing|
so when the query is called it returns
1 mike testing test2
2 alex testing
3 jon

A LEFT JOIN or INNER JOIN, such as can be built in the query design window, is only going to get you so far. It seems from the above that you also wish to concatenate several rows in Table 2 when the id is the same. This cannot be done with Access (Jet) SQL alone. You will need a user-defined function (UDF). You will find two examples here, and a search for concatenate + Access should return others.
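To make the idea concrete, here is a minimal sketch of what such a concatenating function has to do. It is written in Python against an in-memory SQLite database rather than in Access VBA, and the names table1, table2, first_name and concat_notes are assumptions made up for the illustration. The LEFT JOIN keeps ids that have no notes, and the concatenation happens outside the SQL.

```python
import sqlite3


def build_demo():
    """Create the two sample tables from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER PRIMARY KEY, first_name TEXT);
        CREATE TABLE table2 (id INTEGER, notes TEXT);
        INSERT INTO table1 VALUES (1, 'Mike'), (2, 'Alex'), (3, 'Jon');
        INSERT INTO table2 VALUES (1, 'testing'), (1, 'test2'), (2, 'testing');
    """)
    return conn


def concat_notes(conn):
    """Return (id, first_name, notes joined by spaces) for every row of table1."""
    rows = conn.execute(
        "SELECT t1.id, t1.first_name, t2.notes "
        "FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t1.id = t2.id "
        "ORDER BY t1.id, t2.rowid"
    ).fetchall()
    grouped = {}  # (id, first_name) -> list of notes, in insertion order
    for row_id, name, note in rows:
        notes = grouped.setdefault((row_id, name), [])
        if note is not None:  # ids without notes come back with NULL
            notes.append(note)
    return [(row_id, name, " ".join(notes))
            for (row_id, name), notes in grouped.items()]


if __name__ == "__main__":
    for row in concat_notes(build_demo()):
        print(row)
```

An Access UDF would follow the same shape: open a recordset for the given id, loop over it, and return the joined string.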

Related

Technique for querying date based log data

I have date based log data for financial records. Every time the record changes, a new copy of the record is made in the database.
The current method I am using, which I describe below, is both complex and poor performing. I am dealing with millions of rows and lots of log tables.
The logs are tables in my database that mimic the table we are logging with the addition of a unique log identifier and log date.
For instance, database table RecordLog looks like this:
LogId | RecordId | Log Date | Record Data
--------------------------------------------------------
1 |1 | 2019-07-02 | ...
2 |1 | 2019-05-12 | ...
3 |1 | 2019-03-22 | ...
4 |1 | 2019-01-01 | ...
5 |1 | 2018-08-01 | ...
6 |2 | 2018-01-01 | ...
7 |3 | 2019-01-01 | ...
8 |3 | 2019-02-15 | ...
9 |3 | 2018-10-15 | ...
- LogId is the unique id of a row in the RecordLog table, while RecordId references the unique identifier in the Record table.
- Record Data mimics the rest of the Record table.
A lot of reporting and analytics is based on a point in time. For instance, the user wants to know the state of affairs at 2019-01-02.
In that case we would get these rows, since they are the closest recorded instances <= 2019-01-02:
LogId | RecordId | Log Date | Record Data
--------------------------------------------------------
4 |1 | 2019-01-01 | ...
6 |2 | 2018-01-01 | ...
7 |3 | 2019-01-01 | ...
In order to perform these queries now, I am utilizing an inner query.
select * from RecordLog where
...
and ...
and ...
and RecordLog.LogId in (
select max(InnerRecordLog.LogId) from RecordLog as InnerRecordLog
where InnerRecordLog.LogDate <= ?
group by InnerRecordLog.RecordId
order by InnerRecordLog.LogDate desc
)
One of the challenges is that I am using HQL to write these queries, which limits my access to some native database options.
Postgres has a great extension called distinct on which is perfectly suited for this:
select distinct on (rl.recordid) rl.*
from recordlog rl
where rl.logdate <= '2019-01-02'
order by rl.recordid, rl.logdate desc;
distinct on (as used here) returns one record per recordid (the keys in parentheses). The specific record is the latest logdate record -- but subject to the where conditions, of course.
In other databases, the most efficient method is usually a correlated subquery:
select rl.*
from recordlog rl
where rl.logdate = (select max(rl2.logdate)
from recordlog rl2
where rl2.recordid = rl.recordid and
rl2.logdate <= '2019-01-02'
);
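As a quick check of the correlated subquery, the following sketch runs it against the sample RecordLog rows from the question. It uses Python's sqlite3 module with an in-memory database purely for illustration; the function name latest_as_of is an assumption, the Record Data column is left out, and dates are stored as ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

# (logid, recordid, logdate) rows copied from the question's sample table.
ROWS = [
    (1, 1, "2019-07-02"), (2, 1, "2019-05-12"), (3, 1, "2019-03-22"),
    (4, 1, "2019-01-01"), (5, 1, "2018-08-01"), (6, 2, "2018-01-01"),
    (7, 3, "2019-01-01"), (8, 3, "2019-02-15"), (9, 3, "2018-10-15"),
]


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recordlog (logid INTEGER PRIMARY KEY, "
        "recordid INTEGER, logdate TEXT)"
    )
    conn.executemany("INSERT INTO recordlog VALUES (?, ?, ?)", ROWS)
    return conn


def latest_as_of(conn, cutoff):
    """Return the most recent log row per record with logdate <= cutoff."""
    sql = """
        SELECT rl.logid, rl.recordid, rl.logdate
        FROM recordlog AS rl
        WHERE rl.logdate = (SELECT MAX(rl2.logdate)
                            FROM recordlog AS rl2
                            WHERE rl2.recordid = rl.recordid
                              AND rl2.logdate <= ?)
        ORDER BY rl.recordid
    """
    return conn.execute(sql, (cutoff,)).fetchall()


if __name__ == "__main__":
    for row in latest_as_of(build_demo(), "2019-01-02"):
        print(row)
```

For the cutoff 2019-01-02 this returns LogIds 4, 6 and 7, matching the rows the question expects. If two log rows for the same record share a logdate, both are returned, so a tie-breaker on logid may be needed.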

Select a large number of ids from a Hive table

I have a large table with format similar to
+-----+------+------+
|ID |Cat |date |
+-----+------+------+
|12 | A |201602|
|14 | B |201601|
|19 | A |201608|
|12 | F |201605|
|11 | G |201603|
+-----+------+------+
and I need to select entries based on a list with around 5000 thousand IDs. The straightforward way would be to use the list in a WHERE clause, but that would perform really badly and probably would not even work. How can I do this selection?
With a partitioned table things run fast. Once you have partitioned the table, add your ids to the WHERE clause.
You can also extract a subtable from the original one by selecting all the rows whose ids lie between the min and the max of your ids list.
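The min/max idea from the second suggestion can be sketched as follows. This is not Hive code: it uses Python's sqlite3 module as a stand-in, and the names big_table and select_by_ids are assumptions for the illustration. The range predicate cuts the table down cheaply, and the exact membership test is then applied to the smaller result.

```python
import sqlite3


def build_demo():
    """Create the sample table from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE big_table (id INTEGER, cat TEXT, date INTEGER)")
    conn.executemany(
        "INSERT INTO big_table VALUES (?, ?, ?)",
        [(12, "A", 201602), (14, "B", 201601), (19, "A", 201608),
         (12, "F", 201605), (11, "G", 201603)],
    )
    return conn


def select_by_ids(conn, ids):
    """Prefilter on the id range, then keep only rows whose id is in the list."""
    wanted = set(ids)
    if not wanted:
        return []
    lo, hi = min(wanted), max(wanted)
    # Step 1: coarse filter, a single range predicate instead of a huge IN list.
    candidates = conn.execute(
        "SELECT id, cat, date FROM big_table "
        "WHERE id BETWEEN ? AND ? ORDER BY id, date",
        (lo, hi),
    )
    # Step 2: exact filter on the reduced candidate set.
    return [row for row in candidates if row[0] in wanted]


if __name__ == "__main__":
    print(select_by_ids(build_demo(), [12, 19]))
```

How much the range filter helps depends on how densely the wanted ids are spread between their minimum and maximum.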

Column 'Course.Course_Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I need to link columns from two tables, please help me. This is my code:
SELECT Student.Stu_Course_ID, Course.Course_Name, COUNT(Student.Stu_ID) AS NoOfStudent FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Stu_Course_ID;
This is my course table:
__________________________________________
|Course_ID | Course_Name |
|1 | B.Eng in Software Engineering |
|2 | M.Eng in Software Engineering |
|3 | BSC in Business IT |
I got the number of students from the Student table:
_____________________________
|Stu_Course_ID | NoOfStudents |
|1 | 30 |
|2 | 12 |
|3 | 20 |
This is what I want:
____________________________________________________________
|Stu_Course_ID | Course_Name | NoOfStudents|
|1 | B.Eng in Software Engineering | 30 |
|2 | M.Eng in Software Engineering | 12 |
|3 | BSC in Business IT | 20 |
You need to add Course.Course_Name to your group by clause:
SELECT Student.Stu_Course_ID,
Course.Course_Name,
COUNT(Student.Stu_ID) AS NoOfStudent
FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Student.Stu_Course_ID, Course.Course_Name;
Imagine the following simple table (T):
ID | Column1 | Column2 |
----|---------+----------|
1 | A | X |
2 | A | Y |
Your query is similar to this:
SELECT ID, Column1, COUNT(*) AS Count
FROM T
GROUP BY Column1;
You know you have 2 records for A in Column1, so you expect a count of 2. However, you are also selecting ID, and there are two different values of ID where Column1 = A, so the following result:
ID | Column1 | Count |
----|---------+----------|
1 | A | 2 |
Is no more or less correct than
ID | Column1 | Count |
----|---------+----------|
2 | A | 2 |
This is why ID cannot be contained in the select list unless it is included in the GROUP BY clause or is part of an aggregate function.
For what it's worth, if Course_ID is the primary key of the table Course, then the following query is legal according to the SQL Standard and will work in PostgreSQL, and I suspect at some point Microsoft will build this functionality into SQL Server too:
SELECT Course.Course_ID,
Course.Course_Name,
COUNT(Student.Stu_ID) AS NoOfStudent
FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Course.Course_ID;
The reason for this is that since Course.Course_ID is the primary key of Course, there can be no duplicates of it in the table, and therefore there can only be one value of Course_Name for each Course_ID.
Every column you want to retrieve that is not inside an aggregate function has to be listed in the GROUP BY clause, so you have to add Course.Course_Name there as well.
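The corrected query can be tried out with the sketch below, which uses Python's sqlite3 module and a handful of made-up student rows (the question does not show the Student table's contents, so those rows are an assumption). SQLite tolerates bare columns in grouped queries, so it would not reproduce the SQL Server error; the sketch only shows that grouping by both columns yields one row per course with its name and count.

```python
import sqlite3


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Course (Course_ID INTEGER PRIMARY KEY, Course_Name TEXT);
        CREATE TABLE Student (Stu_ID INTEGER PRIMARY KEY, Stu_Course_ID INTEGER);
        INSERT INTO Course VALUES
            (1, 'B.Eng in Software Engineering'),
            (2, 'M.Eng in Software Engineering'),
            (3, 'BSC in Business IT');
        -- Hypothetical students: two on course 1, one each on courses 2 and 3.
        INSERT INTO Student VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    """)
    return conn


def students_per_course(conn):
    """Count students per course, grouping by every non-aggregated column."""
    sql = """
        SELECT Student.Stu_Course_ID,
               Course.Course_Name,
               COUNT(Student.Stu_ID) AS NoOfStudent
        FROM Student
        INNER JOIN Course
            ON Student.Stu_Course_ID = Course.Course_ID
        GROUP BY Student.Stu_Course_ID, Course.Course_Name
        ORDER BY Student.Stu_Course_ID
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in students_per_course(build_demo()):
        print(row)
```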

SQL Group by one column, count entries in another

I'm using a sqlite3 database, with a table like this.
|name |action |
-------------------------
|john |run |
|jim |run |
|john |run |
|john |jump |
|jim |jump |
|jim |jump |
|jim |dive |
I want to get an output like this
|name |run |jump |dive |
---------------------------------
|john |2 |1 |0 |
|jim |1 |2 |1 |
The closest I've come is with this, but I would like to have a single row like above.
SELECT name, action, COUNT(name)
FROM table
GROUP BY name, action
|name |action |COUNT(name) |
|john |run |2 |
|john |jump |1 |
|jim |run |1 |
|jim |jump |2 |
|jim |dive |1 |
Also, I will need to have some WHERE statements in the query as well.
Am I up in the night thinking this will work?
You can also accomplish what you want by using a sum aggregate and CASE conditions like this:
SELECT name,
sum(CASE WHEN action = 'run' THEN 1 ELSE 0 END) as run,
sum(CASE WHEN action = 'jump' THEN 1 ELSE 0 END) as jump,
sum(CASE WHEN action = 'dive' THEN 1 ELSE 0 END) as dive
FROM table
GROUP BY name
You will still have to change the query every time additional actions are added.
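Here is a small runnable sketch of the conditional-sum approach using Python's sqlite3 module and the rows from the question. The table name actions and the function name pivot_counts are assumptions (table itself is a reserved word), and each CASE is given an ELSE 0 branch so that a name with no rows for an action gets 0 rather than NULL.

```python
import sqlite3


def build_demo():
    """Create the sample name/action table from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actions (name TEXT, action TEXT)")
    conn.executemany(
        "INSERT INTO actions VALUES (?, ?)",
        [("john", "run"), ("jim", "run"), ("john", "run"), ("john", "jump"),
         ("jim", "jump"), ("jim", "jump"), ("jim", "dive")],
    )
    return conn


def pivot_counts(conn):
    """Return one row per name with a count column for each known action."""
    sql = """
        SELECT name,
               SUM(CASE WHEN action = 'run'  THEN 1 ELSE 0 END) AS run,
               SUM(CASE WHEN action = 'jump' THEN 1 ELSE 0 END) AS jump,
               SUM(CASE WHEN action = 'dive' THEN 1 ELSE 0 END) AS dive
        FROM actions
        GROUP BY name
        ORDER BY name
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in pivot_counts(build_demo()):
        print(row)
```

Any WHERE conditions go between the FROM and GROUP BY clauses and do not change the pivoting itself.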
What you are trying to do is called cross tabulation. Normally this is available as a feature called a pivot table in Excel and other spreadsheet software.
I have found a blog article which will help you do this using SQL. Check out pivot-table-hack-in-sqlite3-and-mysql
I don't know SQLite that well, but I imagine that you could use subqueries or temp tables.
With mssql you could write something like this:
select distinct Name,
(select count(*) from table as t1 where t1.Name = table.Name and t1.Action = 'run') as run,
(select count(*) from table as t1 where t1.Name = table.Name and t1.Action = 'jump') as jump,
(select count(*) from table as t1 where t1.Name = table.Name and t1.Action = 'dive') as dive
from table
But this would need to be rewritten every time you add another action type. You should probably add an index to the table to speed this up. But check the query plan with "real" data first.
In an Oracle database you can write a query like the one below to get the required result:
select * from table_name
pivot (count(*) for action in ('run','jump','dive'))
This will give the desired output.

complex'ish SQL joins across multiple tables with multiple conditions across all tables

Given the following tables:
labels tags_labels
|id |name | |url |labelid |
|-----|-------| |/a/b |1 |
|1 |punk | |/a/c |2 |
|2 |ska | |/a/b |3 |
|3 |stuff | |/a/z |4 |
artists tags
|id |name | |url |artistid |albumid |
|----|--------| |------|-----------|---------|
|1 |Foobar | |/a/b |1 |2637 |
|2 |Barfoo | |/a/z |2 |23 |
|3 |Spongebob| |/a/c |1 |32 |
I would like to get a list of urls that match a couple of conditions (which can be entered by the user into the script that uses these statements).
For example, the user might want to list all urls that have the labels "(1 OR 2) AND 3", but only if they are by the artists "Spongebob OR Whatever".
Is it possible to do this within a single statement using inner/harry potter/cross/self JOINs?
Or would I have to spread the query across multiple statements and buffer the results inside my script?
Edit:
And if it is possible, what would the statement look like? :p
Yes, you can do this in one query. And maybe an efficient way would be to dynamically generate the SQL statement, based on the conditions the user entered.
This query would allow you to filter by label name or artist name.
Building the sql dynamically to concatenate the user parameters or
passing the desired parameters into a stored procedure would obviously change
the where clauses but that really depends on how dynamic your 'script' must be...
SELECT tl.url
FROM labels l INNER JOIN tags_labels tl ON l.id = tl.labelid
WHERE l.name IN ('ska','stuff')
UNION (
SELECT t.url
FROM artists a INNER JOIN tags t ON a.id = t.artistid
WHERE a.name LIKE '%foo%'
)
Good Luck!
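Note that UNION returns urls matching either branch. For the question's example, where the label condition "(1 OR 2) AND 3" and the artist condition must both hold, one possible alternative is a grouped HAVING clause combined with INTERSECT. The sketch below is an assumption-laden illustration rather than the answer's method: it uses Python's sqlite3 module, hard-codes the label ids from the example, and the function name urls_matching is made up.

```python
import sqlite3


def build_demo():
    """Create the sample tables from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags_labels (url TEXT, labelid INTEGER);
        CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tags (url TEXT, artistid INTEGER, albumid INTEGER);
        INSERT INTO tags_labels VALUES
            ('/a/b', 1), ('/a/c', 2), ('/a/b', 3), ('/a/z', 4);
        INSERT INTO artists VALUES (1, 'Foobar'), (2, 'Barfoo'), (3, 'Spongebob');
        INSERT INTO tags VALUES
            ('/a/b', 1, 2637), ('/a/z', 2, 23), ('/a/c', 1, 32);
    """)
    return conn


def urls_matching(conn, artist_names):
    """Urls labelled (1 OR 2) AND 3 that also belong to one of the artists."""
    # One bound parameter per artist name; values are never spliced into the SQL.
    placeholders = ", ".join("?" for _ in artist_names)
    sql = f"""
        SELECT tl.url
        FROM tags_labels AS tl
        GROUP BY tl.url
        HAVING SUM(CASE WHEN tl.labelid IN (1, 2) THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN tl.labelid = 3 THEN 1 ELSE 0 END) > 0
        INTERSECT
        SELECT t.url
        FROM tags AS t
        INNER JOIN artists AS a ON a.id = t.artistid
        WHERE a.name IN ({placeholders})
        ORDER BY 1
    """
    return [row[0] for row in conn.execute(sql, list(artist_names))]


if __name__ == "__main__":
    print(urls_matching(build_demo(), ["Foobar", "Whatever"]))
```

With the sample data, no url is tagged with Spongebob, so the question's exact artist list yields an empty result; Foobar is used in the usage line only to show a non-empty match.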