I am writing a SELECT query and have observed some strange behaviour. Perhaps someone can explain it.
Please look at the following query:
SELECT
(SELECT count(*) from [DummyData].[dbo].[Users]) as numberOfEntriesInDummy,
(ABS(checksum(Name) % (SELECT count(*) from [DummyData].[dbo].[Users])) + 1) as randomId,
(Select [Name] FROM [DummyData].[dbo].[Users] WHERE Id = (ABS(checksum(Name) % (SELECT count(*) from [DummyData].[dbo].[Users])) + 1)) as randName
FROM [invoiceR-Test].[dbo].[AbpUsers]
This query gives me this result:
numberOfEntriesInDummy randomId randName
14 9 Leano
14 9 Leano
14 3 Leano
14 5 Leano
14 13 Leano
14 11 Leano
What I do not understand is why the "randName" column always gives the same result. "Leano" appears only once in the [DummyData].[dbo].[Users] table and has the ID 7. I would expect the last column to contain changing names.
The second column clearly shows that randomId changes from row to row, but the value queried in the last column is always the same. To me it looks as if the query result from the [DummyData].[dbo].[Users] table is somehow cached...
I think this is an error because you have not qualified your column references. The relevant part of your query is:
SELECT . . .
(SELECT [Name]
FROM [DummyData].[dbo].[Users]
WHERE Id = (ABS(checksum(Name) %
(SELECT count(*) FROM [DummyData].[dbo].[Users])) + 1)) as randName
FROM [invoiceR-Test].[dbo].[AbpUsers];
My guess is that you intend:
SELECT . . .
(SELECT u.[Name]
FROM [DummyData].[dbo].[Users] u
WHERE u.Id = (ABS(checksum(au.Name) %
(SELECT count(*) FROM [DummyData].[dbo].[Users])) + 1)) as randName
FROM [invoiceR-Test].[dbo].[AbpUsers] au;
However, an unqualified name is resolved against the innermost FROM clause first, and Users has a Name column, so this is interpreted as:
SELECT . . .
(SELECT u.[Name]
FROM [DummyData].[dbo].[Users] u
WHERE u.Id = (ABS(checksum(u.Name) %
(SELECT count(*) FROM [DummyData].[dbo].[Users])) + 1)) as randName
FROM [invoiceR-Test].[dbo].[AbpUsers] au;
The subquery is then not correlated with AbpUsers at all, so it returns the same Users row (the one whose Id matches the hash of its own Name) for every outer row.
The moral is to always qualify ALL column references in a query.
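To see the scoping rule in isolation, here is a minimal sketch that uses Python's built-in sqlite3 module rather than SQL Server. The table contents are made up, and length() stands in for the ABS(checksum(...) % count) + 1 expression; the only point is that an unqualified column name binds to the innermost FROM clause that has such a column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES (1, 'Anna'), (2, 'Bo'), (3, 'Carl'), (4, 'Di');
    CREATE TABLE abp_users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO abp_users VALUES (1, 'x'), (2, 'abc'), (3, 'abcd');
""")

# Unqualified: "name" and "id" inside the subquery both bind to users,
# so the subquery never looks at abp_users and returns the same row each time.
unqualified = [row[0] for row in con.execute("""
    SELECT (SELECT name FROM users WHERE id = length(name))
    FROM abp_users
    ORDER BY id
""")]

# Qualified: the subquery is explicitly correlated with the outer row.
qualified = [row[0] for row in con.execute("""
    SELECT (SELECT u.name FROM users u WHERE u.id = length(au.name))
    FROM abp_users au
    ORDER BY au.id
""")]

print(unqualified)
print(qualified)
```

In the unqualified form only 'Bo' satisfies id = length(name), so that name is repeated for every outer row, which mirrors the repeated "Leano" in the question.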
I would like to divide a long text in multiple rows; there are other questions similar to this one but none of them worked for me.
What I have
ID | Message
----------------------------------
1 | Very looooooooooooooooong text
2 | Short text
What I would like to do is divide that string every n characters
Result if n = 15:
Id | Message
------------------------------------------
1 | Very looooooooo
1 | oooooooong text
2 | Short text
Even better if the split is done at the first space after n characters.
I tried with string_split and substring but I cannot find anything that works.
I thought to use something similar to this:
SELECT index, element FROM table, CAST(message AS SUPER) AS element AT index;
But it doesn't take into account the length and I don't like casting a varchar variable into a super.
You can use generate_series() to accomplish this:
select m.*, gs.posn, substring(m.message, gs.posn, 15) as split_message
from messages m
cross join lateral generate_series(1, length(message), 15) gs(posn);
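The same position-series idea can be tried locally with Python's sqlite3 module. This is only a sketch under a few assumptions: a recursive CTE stands in for generate_series(), which may not be available in a given SQLite build, the upper bound of 1000 is an arbitrary cap on message length, and the sample rows are rebuilt from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT)")
long_text = "Very l" + "o" * 17 + "ng text"  # 30 characters
con.executemany("INSERT INTO messages VALUES (?, ?)",
                [(1, long_text), (2, "Short text")])

CHUNK = 15  # number of characters per output row

rows = con.execute("""
    WITH RECURSIVE gs(posn) AS (
        -- start positions 1, 1 + n, 1 + 2n, ... up to an assumed cap of 1000
        SELECT 1
        UNION ALL
        SELECT posn + :n FROM gs WHERE posn + :n <= 1000
    )
    SELECT m.id, substr(m.message, gs.posn, :n) AS split_message
    FROM messages m
    JOIN gs ON gs.posn <= max(1, length(m.message))
    ORDER BY m.id, gs.posn
""", {"n": CHUNK}).fetchall()

for row in rows:
    print(row)
```

The join condition keeps only the start positions that fall inside each message, so short messages produce a single row.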
Splitting on spaces after the length is a little trickier. We would have to split the message into words and then figure out how to break them into groups and then reaggregate.
I could not figure out how to split on spaces without recursion. I hope you don't mind that it treats all whitespace as word boundaries:
with recursive by_words as (
select m.*, s.n, s.word, length(s.word) as word_len,
max(s.n) over (partition by m.id) as num_words
from messages m
cross join lateral regexp_split_to_table(m.message, '\s+')
with ordinality as s(word, n)
), rejoin as (
select id, n, array[word] as words, word_len as cum_word_len,
word_len >= 15 as keep
from by_words
where n = 1
union all
select p.id, c.n,
case
when p.cum_word_len >= 15 then array[c.word]
else p.words||c.word
end as words,
case
when p.cum_word_len >= 15 then c.word_len
else p.cum_word_len + c.word_len + 1
end as cum_word_len,
(p.cum_word_len + c.word_len + 1 >= 15)
or (c.n = c.num_words) as keep
from rejoin p
join by_words c on (c.id, c.n) = (p.id, p.n + 1)
)
select id,
row_number() over (partition by id
order by n) as segnum,
array_to_string(words, ' ') as split_message
from rejoin
where keep
order by 1, 2
;
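For readers who want to check the grouping rule outside the database, the logic of the recursive query can be sketched in plain Python. The function name split_after_n is hypothetical; like the SQL, it treats any whitespace as a word boundary and closes a segment once its length, including single separating spaces, reaches n.

```python
def split_after_n(message, n=15):
    """Greedily group words; close a segment once it is at least n characters long."""
    segments = []
    current = []   # words of the segment being built
    length = 0     # length of " ".join(current)
    for word in message.split():
        # A separating space is counted for every word except the first one.
        length += len(word) + (1 if current else 0)
        current.append(word)
        if length >= n:
            segments.append(" ".join(current))
            current, length = [], 0
    if current:    # trailing words that never reached n characters
        segments.append(" ".join(current))
    return segments


print(split_after_n("the quick brown fox jumps over the lazy dog", 15))
```

Because a segment is only closed after the word that crosses the threshold, segments can be longer than n, which matches the "first space after n characters" requirement.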
Edit to add:
Can you please tell me whether the below works in Redshift?
with gs as (
select generate_series as posn
from generate_series(1, 150000, 15)
)
select *, substring(m.message, gs.posn, 15) as split_message
from messages m
join gs
on gs.posn <= greatest(1, length(m.message))
order by m.id, gs.posn
;
Thanks to @Mike Organek's answer and his help, I found a solution that works with Redshift too.
The problem with Mike's answer on Redshift is that generate_series is not well supported there, so here is a workaround.
with row as (
select t.*, row_number() over () as x
from table t -- big enough table
limit 100
),
result as
(
select (x-1)*15+1 as posn from row --change 15 to a number to split the long text with
)
select * into gs
from result
And then Mike's answer:
select *, substring(m.message from gs.posn for 15) as split_message
from messages m
join gs
on gs.posn <= greatest(1, length(m.message))
order by m.id, gs.posn
I have two tables, student_information and examination_marks.
The examination_marks table has 3 columns, one per subject, holding the marks.
I want to select the roll_number and name of each student from the student_information table where the sum of the three subjects' marks in the examination_marks table is less than 100.
Both tables have roll_number as their primary key.
Here is the query I wrote.
select
si.roll_number,
si.name
from
student_information as si
left outer join examination_marks as em on
si.roll_number = em.roll_number
where
sum(em.subject_one + em.subject_two + em.subject_three) < 100;
But I got an error saying "ERROR 1111 (HY000) at line 1: Invalid use of group function".
Can anyone help me with this?
sum(em.subject_one + em.subject_two + em.subject_three) < 100
This is the problem. Try this instead:
WHERE (SELECT m.subject_one + m.subject_two + m.subject_three FROM examination_marks m WHERE m.roll_number = si.roll_number) < 100
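SUM is an "aggregate function": it combines values across the rows of a group (or across the whole result set when there is no GROUP BY clause), so it cannot appear in a WHERE clause, which is evaluated one row at a time before any grouping happens.
To get the sum of values within the same row you need to use the + operator. If the columns are NULL-able then you'll also need to use COALESCE (or IFNULL in MySQL, ISNULL in SQL Server) to prevent NULL values invalidating your entire expression.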
Like so:
SELECT
si.roll_number,
si.name,
COALESCE( em.subject_one, 0 ) + COALESCE( em.subject_two, 0 ) + COALESCE( em.subject_three, 0 ) AS sum_marks
FROM
student_information AS si
LEFT OUTER JOIN examination_marks AS em ON
si.roll_number = em.roll_number
WHERE
COALESCE( em.subject_one, 0 ) + COALESCE( em.subject_two, 0 ) + COALESCE( em.subject_three, 0 ) < 100;
(If you're wondering why the COALESCE( em.subje... expression is repeated in the SELECT and WHERE clauses, that's because the WHERE clause cannot refer to the sum_marks alias defined in the SELECT list; SQL is an unnecessarily verbose language in this respect.)
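The three behaviours discussed here (the rejected aggregate, the NULL that wipes out a row sum, and the COALESCE fix) can be reproduced with Python's sqlite3 module. This is a sketch with invented sample rows; SQLite's error text differs from MySQL's ERROR 1111, but the cause is the same.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE student_information (roll_number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE examination_marks (
        roll_number INTEGER PRIMARY KEY,
        subject_one INTEGER, subject_two INTEGER, subject_three INTEGER
    );
    INSERT INTO student_information VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat'), (4, 'Dan');
    -- Cat has one NULL mark; Dan has no marks row at all.
    INSERT INTO examination_marks VALUES (1, 30, 30, 30), (2, 50, 40, 30), (3, 40, NULL, 20);
""")

BASE = """
    SELECT si.roll_number, si.name
    FROM student_information AS si
    LEFT OUTER JOIN examination_marks AS em ON si.roll_number = em.roll_number
    WHERE {condition} < 100
    ORDER BY si.roll_number
"""

def run(condition):
    return con.execute(BASE.format(condition=condition)).fetchall()

# Plain +: any NULL operand makes the whole sum NULL, so the row is filtered out.
plain = run("em.subject_one + em.subject_two + em.subject_three")

# COALESCE treats missing marks as 0.
coalesced = run("COALESCE(em.subject_one, 0) + COALESCE(em.subject_two, 0)"
                " + COALESCE(em.subject_three, 0)")

# An aggregate in WHERE is rejected before the query runs.
try:
    run("SUM(em.subject_one + em.subject_two + em.subject_three)")
    aggregate_error = None
except sqlite3.Error as exc:
    aggregate_error = str(exc)

print(plain, coalesced, aggregate_error, sep="\n")
```

Whether a student without any marks row should count as "less than 100" is a design decision; with the LEFT OUTER JOIN plus COALESCE such students are included.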
I'm looking for an explanation of why 1 of the following 3 queries isn't returning what I am expecting.
-- Query 1
SELECT ANNo, ANCpr
FROM Anmodning
WHERE LEFT(ANCpr,6) + '-' + RIGHT(ANCpr,4) NOT IN (SELECT PSCpr FROM Person)
-- Query 2
SELECT ANNo, ANCpr
FROM Anmodning a
LEFT JOIN Person p ON p.PSCpr = LEFT(a.ANCpr,6) + '-' + RIGHT(a.ANCpr,4)
WHERE p.PSNo IS NULL
-- Query 3
SELECT ANNo, ANCpr
FROM Anmodning
WHERE ANNo NOT IN
(
SELECT ANNo
FROM Anmodning
WHERE LEFT(ANCpr,6) + '-' + RIGHT(ANCpr,4) IN (SELECT PSCpr FROM Person)
)
Assume the following:
Anmodning with ANNo=1, ANCpr=1111112222
And the Person table doesn't have a row with PSCpr=111111-2222
Queries are executed in Management Studio against a SQL Server 2017.
Queries 2 and 3 return the Anmodning row as expected, but query 1 does not.
Why is that?
I suspect the issue with the first query is a null-safety problem. If there are null values in Person(PSCpr), then the not in condition filters out all Anmodning rows, regardless of other values in Person.
Consider this simple example:
select 1 where 1 not in (select 2 union all select null)
Returns no rows, while:
select 1 where 1 not in (select 2 union all select 3)
Returns 1 as you would expect.
This problem does not happen when you use left join, as in the second query.
You could also phrase this with NOT EXISTS, which is null-safe and which I would recommend here:
SELECT ANNo, ANCpr
FROM Anmodning a
WHERE NOT EXISTS (SELECT 1 FROM Person p WHERE p.PSCpr = LEFT(a.ANCpr,6) + '-' + RIGHT(a.ANCpr,4))
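The difference between NOT IN and NOT EXISTS in the presence of a NULL can be checked with a small sqlite3 sketch. The sample rows are assumptions, and SQLite's substr() and || replace T-SQL's LEFT(), RIGHT() and +, but the three-valued logic is the same.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Anmodning (ANNo INTEGER PRIMARY KEY, ANCpr TEXT);
    CREATE TABLE Person (PSNo INTEGER PRIMARY KEY, PSCpr TEXT);
    INSERT INTO Anmodning VALUES (1, '1111112222');
    INSERT INTO Person VALUES (1, '222222-3333'), (2, NULL);
""")

# '1111112222' -> '111111-2222'
FORMATTED = "substr(a.ANCpr, 1, 6) || '-' || substr(a.ANCpr, -4)"

def not_in():
    return con.execute(f"""
        SELECT a.ANNo, a.ANCpr FROM Anmodning a
        WHERE {FORMATTED} NOT IN (SELECT PSCpr FROM Person)
    """).fetchall()

def not_exists():
    return con.execute(f"""
        SELECT a.ANNo, a.ANCpr FROM Anmodning a
        WHERE NOT EXISTS (SELECT 1 FROM Person p WHERE p.PSCpr = {FORMATTED})
    """).fetchall()

with_null = (not_in(), not_exists())

# Once the NULL row is gone, both forms agree.
con.execute("DELETE FROM Person WHERE PSCpr IS NULL")
without_null = (not_in(), not_exists())

print(with_null)
print(without_null)
```

With the NULL present, the NOT IN predicate evaluates to unknown rather than true, so the row is dropped; NOT EXISTS only asks whether a matching row was found.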
I have a query. Can you please tell me how to optimize it?
select distinct
trunc(dw.mdf_date) as mdf_date
,dw.dss_id
,dw.raid
,dw.host_type
,dw.volume_name
,dw.volume_size
,dw.prv
,listagg(dw.hba_wwn,',' on overflow truncate '...' without count) within group (order by dw.hba_wwn) as wwn
from
dss_wwn dw
where
dw.volume_name not in ('ADMIN')
and dw.volume_name not like '.%'
and dw.hba_wwn is not null
and not exists (select 1 from wwn shw where shw.wwn = dw.hba_wwn and shw.dic_type_eqp_id = 4 and rownum = 1)
and not exists (select 1 from dss_vmhdd shw where shw.wwid = dw.disk_wwn and rownum = 1)
group by
trunc(dw.mdf_date)
,dw.dss_id
,dw.raid
,dw.host_type
,dw.volume_name
,dw.volume_size
,dw.prv
This query runs for 23 seconds.
If I comment out the following line, it runs fast, in 0.2 seconds:
and not exists (select 1 from wwn shw where shw.wwn = dw.hba_wwn and shw.dic_type_eqp_id = 4 and rownum = 1)
select count(*) from DSS_WWN --100000
select count(*) from WWN t --13000
UPD: @Gro, thanks for your answer. After I removed rownum = 1, the query really did complete in 0.4 seconds.
I have a table like:
Id Word
--- ----
1 this
2 is
3 a
4 cat.
5 that
6 is
7 a
8 dog.
9 and
10 so
11 on
and I need to add a new column for the sentence number, based on the dot character:
Id Word S#
--- ---- --
1 this 1
2 is 1
3 a 1
4 cat. 1
5 that 2
6 is 2
7 a 2
8 dog. 2
9 and 3
10 so 3
11 on 3
What is the best solution from a performance perspective?
select table.id, table.word, count(Z.id) + 1 as serial_number
from table left join
( select id, word from table where word like '%.' ) Z
on table.id > Z.id
group by table.id, table.word
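A quick way to sanity-check this left-join approach is to run it against the sample data with Python's sqlite3 module. In this sketch the table is called words (an assumption, since table is a reserved word), and the counter is swapped between COUNT(*) and COUNT(z.id) to show why the choice matters: COUNT(*) also counts the null-extended row that a LEFT JOIN produces when no earlier sentence end exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT)")
tokens = "this is a cat. that is a dog. and so on".split()
con.executemany("INSERT INTO words VALUES (?, ?)", list(enumerate(tokens, start=1)))

QUERY = """
    SELECT w.id, w.word, {counter} + 1 AS sentence_number
    FROM words w
    LEFT JOIN (SELECT id FROM words WHERE word LIKE '%.') z ON w.id > z.id
    GROUP BY w.id, w.word
    ORDER BY w.id
"""

def sentence_numbers(counter):
    """Return only the sentence_number column for the given counting expression."""
    return [row[2] for row in con.execute(QUERY.format(counter=counter))]

with_count_star = sentence_numbers("COUNT(*)")
with_count_id = sentence_numbers("COUNT(z.id)")

print(with_count_star)
print(with_count_id)
```

Only the COUNT(z.id) variant numbers the first sentence as 1, because COUNT over a column ignores NULLs.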
You're assuming that sentences are formed by ascending id number. That's a really bad idea.
This query should give you information about the sentence breaks. (Replace "T" with the real table name.)
SELECT
Break1.Id as BreakId,
COALESCE(MAX(Break2.Id), 0) as PreviousBreakId,
COALESCE(COUNT(Break2.Id), 0) + 1 as BreakNumber
FROM
(SELECT Id FROM T WHERE Word LIKE '%.') as Break1,
(SELECT Id FROM T WHERE Word LIKE '%.') as Break2
WHERE Break2.Id < Break1.Id
GROUP BY Break1.Id
Here's how you might use it in an UPDATE.
UPDATE T
SET SentenceNum = (
SELECT B.BreakNumber
FROM
(
SELECT
Break1.Id as BreakId,
COALESCE(MAX(Break2.Id), 0) as PreviousBreakId,
COALESCE(COUNT(Break2.Id), 0) + 1 as BreakNumber
FROM
(SELECT Id FROM T WHERE Word LIKE '%.') as Break1,
(SELECT Id FROM T WHERE Word LIKE '%.') as Break2
WHERE
Break2.Id < Break1.Id
GROUP BY Break1.Id
) as B
WHERE T.Id > B.PreviousBreakId AND T.Id <= B.BreakId
)
I offer the query for educational value but I can't condone the approach based on your information.
EDIT
My original version had a problem with the first sentence because basically the logic looks for a preceding sentence break that doesn't exist. @cravoori's solution handles this via a left join. Here's a working version in the same spirit of my own answer which returns the full list of words rather than the breaks. Except for the cross join and the dummy zero row, at heart it's the same.
SELECT T.Id, MIN(T.Word) as Word, COUNT(Breaks.Id) as SentenceNumber
FROM T, (SELECT 0 as Id UNION ALL SELECT Id FROM T WHERE Word LIKE '%.') as Breaks
WHERE Breaks.Id < T.Id
GROUP BY T.Id;
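The query can be checked against the expected S# column from the question with Python's sqlite3 module. The sketch below assumes the table is named T and loads the eleven sample words; an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (Id INTEGER PRIMARY KEY, Word TEXT)")
tokens = "this is a cat. that is a dog. and so on".split()
con.executemany("INSERT INTO T VALUES (?, ?)", list(enumerate(tokens, start=1)))

# The dummy row with Id 0 acts as the break that "precedes" the first sentence.
numbered = con.execute("""
    SELECT T.Id, MIN(T.Word) AS Word, COUNT(Breaks.Id) AS SentenceNumber
    FROM T, (SELECT 0 AS Id UNION ALL SELECT Id FROM T WHERE Word LIKE '%.') AS Breaks
    WHERE Breaks.Id < T.Id
    GROUP BY T.Id
    ORDER BY T.Id
""").fetchall()

for row in numbered:
    print(row)
```

Each word's sentence number is simply the count of breaks, including the dummy one, that come strictly before it.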
Check this query to get the desired output:
DECLARE @Sentence TABLE(idn int identity,word varchar(50))
INSERT INTO @Sentence
VALUES('this'),('is'),('a'),('cat.'),('that'),('is'),('a'),('dog.'),('and'),('so'),('on.'),('I'),('love'),('india.')
--SELECT * FROM @Sentence
DECLARE @seq int=1
;WITH CTE(idn,word,seq) AS(
SELECT idn,word,CASE WHEN word not like '%.' then @seq END from @Sentence where idn=1
union all
SELECT s.idn,s.word,CASE WHEN s.word like '%.' then c.seq+1 else c.seq END from @Sentence s inner join CTE c on s.idn-1=c.idn
)
,CTE1(idn,word,seq) As
(SELECT idn,word,CASE WHEN word like '%.' then seq-1 else seq end as seq from CTE)
SELECT * FROM CTE1