SQL 2-table query results need to be concatenated

My current SQL:
SELECT B.MESSAGENO, B.LINENO, B.LINEDATA
FROM BILL.MESSAGE AS B, BILL.ACTIVITY AS A
WHERE A.MSGNO = B.MESSAGENO AND A.FUPTEAM = 'DBWB'
AND A.ACTIVITY = 'STOPPAY' AND A.STATUS = 'WAIT'
AND A.COMPANY = B.COMPANY
MESSAGENO LINENO LINEDATA
1234567 1 CHEQUE NO : 9999999 RUN NO : 55555
1234567 2 DATE ISSUED: 12/25/2020 AMOUNT : 710.51
1234567 3 PAYEE : LASTNAME, FIRSTNAME
1234567 4 ACCOUNT NO : 12345-67890
1234567 5 USER : USERNAME
There are 550 sets of 5 lines, one set per MESSAGENO.
What I am trying to figure out is how to pull the values out of LINEDATA: where LINENO = 1, get just 9999999 as checkno; where LINENO = 2, get 710.51 as amount; where LINENO = 3, get LASTNAME, FIRSTNAME as payee; where LINENO = 4, get 12345-67890 as the account number; and lastly, the same thing for USERNAME.
I just cannot seem to conceptualize this. Every time I try, my brain starts turning into macaroni. Any help is appreciated.

UPDATED ANSWER, extracts all fields from stored strings:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e22d26866053ea6088aa78dc23c4d809
Check this fiddle.
It uses SUBSTRING_INDEX in the inner query to split the fields at the ":" or at a combination of ": " and two consecutive spaces. I used two spaces because I wasn't sure what your actual whitespace was, and when I copied the data out of your post it was just spaces.
Then MAX is used in the outer query to get everything on one line, grouping by MESSAGENO. Since some lines have two pieces of data to extract, a second string parser was added. Here is the code from the fiddle, for reference. In this case, the table test was created from the output of OP's initial query, since I didn't have enough info to build both tables and completely recreate that output.
SELECT MESSAGENO,
    MAX(if(LINENO = 1, extractFirst, null)) as checkNo,
    MAX(if(LINENO = 1, extractLast, null)) as runNo,
    MAX(if(LINENO = 2, extractFirst, null)) as issued,
    MAX(if(LINENO = 2, extractLast, null)) as amount,
    MAX(if(LINENO = 3, extractFirst, null)) as payee,
    MAX(if(LINENO = 4, extractFirst, null)) as accountNo,
    MAX(if(LINENO = 5, extractFirst, null)) as username
FROM (
    SELECT MESSAGENO, LINENO,
        -- the inner delimiter is two spaces (the gap between fields)
        trim(substring_index(substring_index(LINEDATA, ": ", -2), "  ", 1)) as extractFirst,
        trim(substring_index(LINEDATA, ":", -1)) as extractLast
    FROM test
) z
GROUP BY MESSAGENO
Again, you would be much better off altering your tables so that you can use simpler queries, as shown in the last fiddle.
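For comparison, the same split-then-pivot idea can be sketched outside the database. The Python below is only an illustration, assuming the five-line layout shown in the question; the function name pivot, the PATTERNS table and the output keys are hypothetical names chosen to mirror the column aliases in the query above.

```python
import re
from collections import defaultdict

# Sample rows copied from the question: (MESSAGENO, LINENO, LINEDATA).
rows = [
    (1234567, 1, "CHEQUE NO : 9999999 RUN NO : 55555"),
    (1234567, 2, "DATE ISSUED: 12/25/2020 AMOUNT : 710.51"),
    (1234567, 3, "PAYEE : LASTNAME, FIRSTNAME"),
    (1234567, 4, "ACCOUNT NO : 12345-67890"),
    (1234567, 5, "USER : USERNAME"),
]

# One pattern per LINENO; each named group becomes an output column.
# \s* and \s+ tolerate any amount of padding around labels and values.
PATTERNS = {
    1: re.compile(r"CHEQUE NO\s*:\s*(?P<checkNo>\S+)\s+RUN NO\s*:\s*(?P<runNo>\S+)"),
    2: re.compile(r"DATE ISSUED\s*:\s*(?P<issued>\S+)\s+AMOUNT\s*:\s*(?P<amount>\S+)"),
    3: re.compile(r"PAYEE\s*:\s*(?P<payee>.+)"),
    4: re.compile(r"ACCOUNT NO\s*:\s*(?P<accountNo>\S+)"),
    5: re.compile(r"USER\s*:\s*(?P<username>\S+)"),
}

def pivot(rows):
    """Collapse the per-line rows into one dict of fields per MESSAGENO."""
    messages = defaultdict(dict)
    for message_no, line_no, line_data in rows:
        pattern = PATTERNS.get(line_no)
        match = pattern.search(line_data) if pattern else None
        if match:
            fields = {name: value.strip() for name, value in match.groupdict().items()}
            messages[message_no].update(fields)
    return dict(messages)

result = pivot(rows)
print(result[1234567])
```

Grouping by MESSAGENO in a dict plays the role of GROUP BY plus MAX in the SQL version: each line contributes only the fields it carries.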
===============================================
ORIGINAL ANSWER (demo of string parsing, suggestion for data model change)
You can achieve this with some string parsing BUT ABSOLUTELY DO NOT unless you have no choice. The reason you are having trouble is that your data shouldn't be stored this way.
I've included a fiddle that uses this CASE expression and SUBSTRING_INDEX to extract the data. I have assumed MySQL 8 because you didn't specify the SQL version.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=068a49a2c819c08018691e54bcdf91e5
case when LINENO = 1 then trim(substring_index(substring_index(LINEDATA, "RUN NO", 1), ":", -1))
else trim(substring_index(LINEDATA, ":", -1)) end
as LDATA
See this fiddle for the full statement. I have just inserted the data from your join into a test table, instead of trying to recreate all your tables, since I don't have access to all the data I would need for that. In future, set up a fiddle like this one with some representative data and the SQL version, and it will be easier for people to help you.
=========================================
I think this is a better layout for you, with all data stored as the proper type and a field defined for each one and the extra text stripped out:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=11d52189b005740cdc53175466374635

Related

How can I count all NULL values, without column names, using SQL?

I'm reading and executing sql queries from file and I need to inspect the result sets to count all the null values across all columns. Because the SQL is read from file, I don't know the column names and thus can't call the columns by name when trying to find the null values.
I think using CTE is the best way to do this, but how can I call the columns when I don't know what the column names are?
WITH query_results AS
(
<sql_read_from_file_here>
)
select count_if(<column_name> is null) FROM query_results
If you are using Python to read the file of SQL statements, you can do something like this which uses pglast to parse the SQL query to get the columns for you:
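import pglast
sql_read_from_file_here = "SELECT 1 foo, 1 bar"
# Parse the statement and read the output column aliases from its target list.
# (Dict-style access assumes a pglast version whose parse_sql returns a JSON-like tree.)
ast = pglast.parse_sql(sql_read_from_file_here)
cols = ast[0]['RawStmt']['stmt']['SelectStmt']['targetList']
# Build one "count the NULLs" term per column and add the terms together.
sum_stmt = "sum(iff({col} is null,1,0))"
sums = [sum_stmt.format(col=col['ResTarget']['name']) for col in cols]
print(f"select {' + '.join(sums)} total_null_count from query_results")
# outputs: select sum(iff(foo is null,1,0)) + sum(iff(bar is null,1,0)) total_null_count from query_results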
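A different route, if parsing the SQL is not an option, is to run the query and read the column names from the cursor itself. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 as a stand-in for the real database, and the helper name count_nulls and the table t are hypothetical. Any DB-API driver exposes cursor.description in the same way, but this version counts on the client side, so every row is fetched.

```python
import sqlite3

def count_nulls(conn, query):
    """Run `query` and count NULLs per column without knowing the column names."""
    cursor = conn.execute(query)
    # cursor.description holds one entry per result column; item 0 is its name.
    names = [col[0] for col in cursor.description]
    counts = [0] * len(names)
    for row in cursor:
        for position, value in enumerate(row):
            if value is None:  # SQL NULL arrives as Python None
                counts[position] += 1
    return list(zip(names, counts)), sum(counts)

# Hypothetical sample data, only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foo INTEGER, bar TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, None), (None, None), (3, "x")])

per_column, total = count_nulls(conn, "SELECT foo, bar FROM t")
print(per_column, total)
```

Counting by position rather than by name keeps the helper correct even when a query returns two columns with the same name.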

Trying to search on the last 4 of SSN (SQL)

I have found multiple resources that show how to retrieve the last 4 digits of an SSN, but I am trying to search the database for rows whose last 4 match a value I am given. I am working with Oracle, by the way. I've tried a few different things, but below is basically what I mean:
SELECT gm.plan_name, tpam.third_prty_admin_name, pm.partcpnt_first_name,
pm.partcpnt_last_name, pm.partcpnt_id, gm.grp_id,
tpam.third_prty_admin_id,
pm.partcpnt_birth_dt,pm.partcpnt_setup_dt
FROM gva_s01.partcpnt_mstr pm,
gva_s01.grp_mstr gm,
gva_s01.third_prty_admin_mstr tpam
WHERE gm.cntrct_id = 'CNTRCTID0000'
AND gm.third_prty_admin_id = 0
AND gm.third_prty_admin_id = tpam.third_prty_admin_id
AND gm.cntrct_id = pm.cntrct_id
--This is where I can't figure it out
AND SUBSTRING(pm.partcpnt_id, -4) = '1234'
In Oracle, you want SUBSTR( string, position ) and a negative position will count from the end of the string. So:
AND SUBSTR(pm.partcpnt_id, -4) = '1234'
Or you can use LIKE:
AND pm.partcpnt_id LIKE '%1234'
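Both forms can be tried out quickly without an Oracle instance. The sketch below uses Python's built-in sqlite3 purely as a stand-in, on the assumption that it is close enough for this point: SQLite's substr also counts a negative start position from the end of the string. The cut-down table and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column version of partcpnt_mstr with made-up IDs.
conn.execute("CREATE TABLE partcpnt_mstr (partcpnt_id TEXT, partcpnt_last_name TEXT)")
conn.executemany(
    "INSERT INTO partcpnt_mstr VALUES (?, ?)",
    [("111221234", "Adams"), ("999881234", "Baker"), ("123456789", "Clark")],
)

last4 = "1234"

# Negative start position: take the final four characters and compare them.
by_substr = conn.execute(
    "SELECT partcpnt_last_name FROM partcpnt_mstr "
    "WHERE substr(partcpnt_id, -4) = ? ORDER BY partcpnt_last_name",
    (last4,),
).fetchall()

# LIKE with a leading wildcard: match any ID that ends with the given digits.
by_like = conn.execute(
    "SELECT partcpnt_last_name FROM partcpnt_mstr "
    "WHERE partcpnt_id LIKE '%' || ? ORDER BY partcpnt_last_name",
    (last4,),
).fetchall()

print(by_substr, by_like)
```

Passing the digits as a bound parameter rather than pasting them into the SQL text avoids quoting problems when the value comes from user input.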

LAST REG From a Query SQL

I'm trying to get the last record from this query, but I don't know how to do it. I used ROW_NUMBER, but my program (Protheus ADVPL) doesn't have a way to get the last line from a query.
SELECT ROW_NUMBER() OVER (ORDER BY B1_MASTER, B1_COD) AS ID,
B1_COD,
B1_DESC,
B1_CATEG,
B1_MASTER,
A2_COMPRAD,
ISNULL((SELECT Sum(C6_QTDVEN * C6_PRCVEN)
FROM SC6010 SC6,
SF4010 SF4,
SC5010 SC5
WHERE C6_FILIAL = '01'
AND C6_PRODUTO = B1_COD
AND SC6.D_E_L_E_T_ <> '*'
AND C5_FILIAL = C6_FILIAL
AND C5_NUM = C6_NUM
AND C5_EMISSAO BETWEEN '20160401' AND '20160404'
AND C5_TIPO = 'N'
AND C5_MODAL = '2'
AND SC5.D_E_L_E_T_ <> '*'
(The query has 106 lines, so I won't post all of it.)
I need the total number of records in a column, like this:
Tabela
What can I do?
Thanks
You can use MAX(field) too.
But, you're using ADVPL, so you could use dbSeek instead to find the last RECNO.
So, using "work area" you can find the last record with this:
TRB->(RECCOUNT())
I changed ROW_NUMBER to @@ROWCOUNT and it works! Thanks, all.

SQL CONCAT IF Statement?

Morning All,
I'm not too sure how to solve the following. I have this query, which pulls back the desired records in SQL Server:
SELECT agenda.AgendaItemNumber,Agenda.AgendaName, AgendaType.AgendaTypeDescription, userdetails.fullName
FROM Agenda
JOIN AgendaType ON AgendaType.AgendaTypeID=Agenda.AgendaTypeID
JOIN UserDetails ON Agenda.AgendaID = Userdetails.AgendaID
WHERE agenda.AgendaTypeID = '2'
AND AgendaItemNumber = AgendaItemNumber
AND AgendaName = AgendaName
AND AgendaTypeDescription = AgendaTypeDescription
AND AgendaItemNumber >= '3'
The above query works, but I need to enhance it slightly. It pulls back the following results, which are essentially duplicate records except for the 'fullname' column...
What I would like to do is add some extra code to this query so that it displays one record for each 'AgendaItemNumber' and concatenates both of the full names for that record. However, I have additional AgendaItemNumbers in this table that have only one user full name assigned to them. It's just these few records in the image that I need to do something clever with.
Maybe there is a better way to complete this task?
Many thanks in advance. If you have any queries, please don't hesitate to ask.
Regards
Betty
SELECT agenda.AgendaItemNumber,
Agenda.AgendaName,
AgendaType.AgendaTypeDescription,
STUFF(( SELECT ';' + FullName
FROM UserDetails
WHERE UserDetails.AgendaID = Agenda.AgendaID
FOR XML PATH('')
), 1, 1, '') AS fullName
FROM Agenda
INNER JOIN AgendaType
ON AgendaType.AgendaTypeID=Agenda.AgendaTypeID
INNER JOIN UserDetails
ON Agenda.AgendaID = Userdetails.AgendaID
WHERE agenda.AgendaTypeID = '2'
AND AgendaItemNumber = AgendaItemNumber
AND AgendaName = AgendaName
AND AgendaTypeDescription = AgendaTypeDescription
AND AgendaItemNumber >= '3'
ADDENDUM
The XML extension in SQL Server lets you concatenate multiple rows into a single row. Its actual purpose is to produce XML output (obviously), but it has some nifty by-products. In the query above, if the subquery column had a name (FullName), the output would be <FullName>Joe Bloggs1</FullName><FullName>Joe Bloggs2</FullName>; because the expression ';' + FullName has no column name, the rows are simply concatenated (not forming proper XML). The PATH part lets you specify an additional node: for example, PATH('Name') in the query above would give <Name>;Joe Bloggs1</Name><Name>;Joe Bloggs2</Name>. If you combine PATH with a column name, each value is wrapped in both elements, e.g. <Name><FullName>Joe Bloggs1</FullName></Name>.
Finally, the STUFF just removes the semicolon at the start of the list.
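The effect of the query, one row per agenda item with the names joined by a semicolon, can be reproduced on a small scale. This is a sketch under stated assumptions rather than the SQL Server technique itself: it runs on Python's built-in sqlite3 and swaps STUFF/FOR XML PATH for SQLite's group_concat aggregate. The sample rows and cut-down tables are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down versions of the Agenda and UserDetails tables.
conn.executescript("""
CREATE TABLE Agenda (AgendaID INTEGER, AgendaItemNumber INTEGER, AgendaName TEXT);
CREATE TABLE UserDetails (AgendaID INTEGER, FullName TEXT);
INSERT INTO Agenda VALUES (1, 3, 'Budget'), (2, 4, 'Staffing');
INSERT INTO UserDetails VALUES (1, 'Joe Bloggs1'), (1, 'Joe Bloggs2'), (2, 'Jane Doe');
""")

# One output row per agenda item; the names in each group are joined with ';'.
rows = conn.execute("""
    SELECT a.AgendaItemNumber, a.AgendaName,
           group_concat(u.FullName, ';') AS fullName
    FROM Agenda AS a
    JOIN UserDetails AS u ON u.AgendaID = a.AgendaID
    GROUP BY a.AgendaItemNumber, a.AgendaName
    ORDER BY a.AgendaItemNumber
""").fetchall()

for item_number, name, full_names in rows:
    print(item_number, name, full_names)
```

Agenda items with a single user come out unchanged, so no special case is needed for them. The order of names inside each concatenated value is not guaranteed here.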

Creating a new table from grouped substring of existing table

I am having some trouble creating some SQL (for SQL server 2008).
I have a table of tasks that are priority ordered, comma delimited tasks:
Id = 1, LongTaskName = "a,b,c"
Id = 2, LongTaskName = "a,c"
Id = 3, LongTaskName = "b,c"
Id = 4, LongTaskName = "a"
etc...
I am trying to build a new table that groups them by the first task, along with the id:
GroupName: "a", TaskId: 1
GroupName: "a", TaskId: 2
GroupName: "a", TaskId: 4
GroupName: "b", TaskId: 3
Here is the naive, slow LINQ code:
foreach(var t in Tasks)
{
var gt = new GroupedTasks();
gt.TaskId = t.Id;
var firstWord = t.LongTaskName.Split(',');
if(firstWord.Count() > 0)
{
gt.GroupName = firstWord.First();
}
else
{
gt.GroupName = t.LongTaskName;
}
GroupedTasks.InsertOnSubmit(gt);
}
I wrote a sql function to do the string split:
create function fn_Split(
    @String nvarchar (4000),
    @Delimiter nvarchar (10)
)
returns nvarchar(4000)
begin
    declare @FirstComma int
    set @FirstComma = charindex(@Delimiter,@String)
    if(@FirstComma = 0)
        return @String
    return substring(@String, 0, @FirstComma)
end
go
However, I am getting stuck on the real sql to do the work.
I can get the group by alone:
SELECT dbo.fn_Split(LongTaskName, ',')
FROM [dbo].[Tasks]
GROUP BY dbo.fn_Split(LongTaskName, ',')
And I know I need to head down something like this:
DECLARE @RowSet TABLE (GroupName nvarchar(1024), Id nvarchar(5))
insert into @RowSet
select ???
FROM [dbo].Tasks as T
INNER JOIN
(
    SELECT dbo.fn_Split(LongTaskName, ',')
    FROM [dbo].[Tasks]
    GROUP BY dbo.fn_Split(LongTaskName, ',')
) G
ON T.??? = G.???
ORDER BY ???
INSERT INTO dbo.GroupedTasks(GroupName, Id)
select * from @RowSet
But I am not quite grokking how to reference the grouped relationships, and I am confused about having to call split multiple times.
Any thoughts?
If you only care about the first item in the list, there's really no need for a function. I would recommend the approach below. You also don't need the @RowSet table variable for any temporary holding.
INSERT dbo.GroupedTasks(GroupName, Id)
SELECT
LEFT(LongTaskName, COALESCE(NULLIF(CHARINDEX(',', LongTaskName)-1, -1), 1024)),
Id
FROM dbo.Tasks;
It is even easier if the tasks are one character long: you can use LEFT(LongTaskName, 1) instead of the ugly LEFT/CHARINDEX expression. But I'm guessing your task names are not one character long (if they are not, you should include some data that varies a bit so that others don't make assumptions about length).
Now, keep in mind that you'll have to do something like this to keep dbo.GroupedTasks up to date every time a dbo.Tasks row is inserted, updated or deleted. How are you going to keep these two tables in sync?
More to the point, you should consider storing the top priority task separately in the first place, either by using a computed column or separating it out before the insert. Munging data together is something that you do with hash tables and arrays in application code, but it rarely has any positive attributes inside a database. You almost always spend more time and effort extracting the data apart than you ever saved by keeping it together in the first place. This will negate the need for a second table at all.
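The rule behind the LEFT/CHARINDEX expression is small enough to state on its own: take everything before the first comma, or the whole string when there is no comma. A minimal Python sketch of that rule, using the sample rows from the question (the function name first_task is hypothetical):

```python
def first_task(long_task_name: str) -> str:
    """Return the text before the first comma, or the whole string if none."""
    comma_at = long_task_name.find(",")  # -1 when no comma is present
    return long_task_name if comma_at == -1 else long_task_name[:comma_at]

# Sample data from the question: Id -> LongTaskName.
tasks = {1: "a,b,c", 2: "a,c", 3: "b,c", 4: "a"}

# (GroupName, TaskId) pairs, sorted so rows with the same group sit together.
grouped = sorted((first_task(name), task_id) for task_id, name in tasks.items())
print(grouped)
```

The no-comma branch corresponds to the NULLIF/COALESCE fallback in the SQL, which keeps the whole name when CHARINDEX returns 0.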
Select Id, Split( ',', LongTaskName ) as GroupName into TasksWithGroupInfo
Does this answer your question?