SQL statement to select the rows preceding a search match - sql

I'm after an SQL statement (if one exists), or a way to combine several SQL statements, to achieve the following.
I have a listbox and a search text box.
In the search box, the user enters a surname, e.g. smith.
I then want to query the database with something like this:
select * FROM customer where surname LIKE searchparam
This would give me all customers whose surname contains SMITH. Simple, right?
What I need to do is limit the results returned. This statement could return thousands of rows if the search param was just S.
What I want is the result limited to the first 20 matches AND the 10 rows prior to the first match.
For example, SMI search:
Sives
Skimmings
Skinner
Skipper
Slater
Sloan
Slow
Small
Smallwood
Smetain
Smith ----------- This is the first match of my query, but I want the previous 10 rows and the following 20.
Smith
Smith
Smith
Smith
Smoday
Smyth
Snedden
Snell
Snow
Sohn
Solis
Solomon
Solway
Sommer
Sommers
Soper
Sorace
Spears
Spedding
Is there any way to do this?
Ideally with as few SQL statements as possible.
Reason? I am creating an app for users with slow internet connections.
I am using PostgreSQL v9.
Thanks
Andrew

WITH ranked AS (
SELECT *, ROW_NUMBER() over (ORDER BY surname) AS rowNumber FROM customer
)
SELECT ranked.*
FROM ranked, (SELECT MIN(rowNumber) target FROM ranked WHERE surname LIKE searchparam) found
WHERE ranked.rowNumber BETWEEN found.target - 10 AND found.target + 20
ORDER BY ranked.rowNumber
SQL Fiddle here. Note that the fiddle uses the example data, and I modified the range to 3 entries before and 6 entries past.

I'm assuming that you're looking for a general algorithm ...
It sounds like you're looking for a combination of finding the matches "greater than or equal to smith", and "less than smith".
For the former you'd order by surname and limit the result to 20, and for the latter you'd order by surname descending and limit to 10.
The two result sets can then be added together as arrays and reordered.
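The two-query idea above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it uses Python's built-in sqlite3 module rather than PostgreSQL, the table contents are a subset of the surnames from the question, and the helper name surrounding_rows is hypothetical. It treats the search term as a prefix compared with >= and <, which is not the same as a LIKE '%smi%' "contains" search, and the ordering depends on the database collation.

```python
import sqlite3

# Sample data: a subset of the surnames listed in the question.
SURNAMES = [
    "Sives", "Skimmings", "Skinner", "Skipper", "Slater", "Sloan", "Slow",
    "Small", "Smallwood", "Smetain", "Smith", "Smith", "Smoday", "Smyth",
    "Snedden", "Snell", "Snow", "Sohn",
]


def surrounding_rows(conn, prefix, before=10, after=20):
    """Return `before` surnames sorting below `prefix`, then `after` at or above it."""
    # Rows at or after the search prefix, in ascending order.
    following = conn.execute(
        "SELECT surname FROM customer WHERE surname >= ? "
        "ORDER BY surname LIMIT ?",
        (prefix, after),
    ).fetchall()
    # Rows before the prefix: sort descending so LIMIT keeps the closest ones.
    preceding = conn.execute(
        "SELECT surname FROM customer WHERE surname < ? "
        "ORDER BY surname DESC LIMIT ?",
        (prefix, before),
    ).fetchall()
    # Reverse the preceding rows so the combined list is ascending again.
    return [r[0] for r in reversed(preceding)] + [r[0] for r in following]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (surname TEXT)")
conn.executemany("INSERT INTO customer VALUES (?)", [(s,) for s in SURNAMES])

# 3 rows before and 6 rows from the first match onwards, to keep the output short.
result = surrounding_rows(conn, "Smi", before=3, after=6)
print(result)
```

Both statements can use an index on surname, so neither has to number the whole table. The price is two round trips instead of one, unless the two SELECTs are combined with UNION ALL.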

I think you need to use ROW_NUMBER() (see this link).
WITH cust1 AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY surname) as numRow FROM customer
)
SELECT c1.surname, c1.numRow, x.flag
FROM cust1 c1, (SELECT *,
case when numRow = (SELECT MIN(numRow) FROM cust1 WHERE surname='Smith') then 1 else 0 end as flag
FROM cust1) x
WHERE x.flag = 1 and c1.numRow BETWEEN x.numRow - 1 AND x.numRow + 1
ORDER BY c1.numRow
SQLFiddle here.
This works, but the flag turns out to be unnecessary; without it the query reduces to the one PinnyM posted.

A variation on @PinnyM's solution:
WITH ranked AS (
SELECT
*,
ROW_NUMBER() over (ORDER BY surname) AS rowNumber
FROM customer
),
minrank AS (
SELECT
*,
MIN(CASE WHEN surname LIKE searchparam THEN rowNumber END) OVER () AS target
FROM ranked
)
SELECT
surname
FROM minrank
WHERE rowNumber BETWEEN target - 10 AND target + 20
;
Instead of two separate calls to the ranked CTE, one to get the first match's row number and the other to read the results from, another CTE is introduced to serve both purposes. Can't speak for PostgreSQL but in SQL Server this might result in a better execution plan for the query, although in either case the real efficiency would still need to be verified by proper testing.

Related

Structuring BigQuery on trigrams with multiple inputs

Presently, thanks to help from the answerer of this question, I am able to successfully query a word and get a list of the most popular follow-on words. For example, using the word "great", I am able to get a list of up to 10 words in the following format:
SELECT second, SUM(cell.page_count) total
FROM [publicdata:samples.trigrams]
WHERE first = "great"
group by 1
order by 2 desc
limit 10
With the output:
second total
------------------
deal 3048832
and 1689911
, 1576341
a 1019511
number 984993
many 875974
importance 805215
part 739409
. 700694
as 628978
What I am currently having trouble figuring out is how to run this query for multiple words automatically (as opposed to running a separate query for each word), so that I could get an output such as:
"great" total "new_word_1" new_total_1 ... "new_word_N" new_total_N
-----------------------------------------------------------------------------------------
deal 3048832 "new_follow_on_word1" 123456 ... "follow_on_N1" 234567
and 1689911 "new_follow_on_word2" 12345 ... "follow_on_N2" 123456
Essentially, I would like to pass N words in a single query (for example, new_word_1 is a totally different word like "baseball", with no relation to "great") and get the total counts for each word in a separate column.
Additionally, after learning about BigQuery's pricing, I am also having trouble figuring out how to limit the total data queried as much as possible. I can think of using only the latest data (say, 2010 onwards) and 2 alphanumeric outputs per word, but I may be missing more obvious limiters.
Any help on this is much appreciated - thanks!
You can put multiple first words in the same query, but it will need to compute the top 10 following words for each one separately and then join the results together. Here is an example for "great" and "baseball":
SELECT word1, total1, word2, total2 FROM
(SELECT ROW_NUMBER() OVER() rowid1, word1, total1 FROM (
SELECT second as word1, SUM(cell.page_count) total1
FROM [publicdata:samples.trigrams]
WHERE first = "great"
group by 1
order by 2 desc
limit 10)) a1
JOIN
(SELECT ROW_NUMBER() OVER() rowid2, word2, total2 FROM (
SELECT second as word2, SUM(cell.page_count) total2
FROM [publicdata:samples.trigrams]
WHERE first = "baseball"
group by 1
order by 2 desc
limit 10)) a2
ON a1.rowid1 = a2.rowid2

Query to get number of % sign = length of string in the next row in Oracle 10g

Is there any SQL query in Oracle 10g that can produce the output shown in the sample below?
The query should print the name in the first row, and in the second row it should print as many "%" signs as the name has characters.
Could you please help?
Below is the sample of table column
JIM
JOHN
MICHAEL
and the output should come like below :
JIM
%%%
JOHN
%%%%
MICHAEL
%%%%%%%
This would normally be considered an issue for presentation logic, not database logic. However, one option would be to use union all, and then length and rpad to get the correct number of % signs. You'd also need to establish a row number to keep the order together.
Here's one approach:
select name
from (
select name, rownum rn
from yourtable
union all
select rpad('%', length(name), '%') name, rownum + 1 rn
from yourtable ) t
order by rn, name
SQL Fiddle Demo
You can check this link:
http://www.club-oracle.com/articles/oracle-pivoting-row-to-column-conversion-techniques-sql-166/
Many options are discussed there; it should help.

SQL query interleaving two different statuses

I have a table testtable having fields
Id Name Status
1 John active
2 adam active
3 cristy inactive
4 benjamin inactive
5 mathew active
6 thomas inactive
7 james active
I want a query that displays the result like this:
Id Name Status
1 John active
3 cristy inactive
2 adam active
4 benjamin inactive
5 mathew active
6 thomas inactive
7 james active
My question is how to fetch the records from this table in alternating order of status: active, then inactive, then active, then inactive, and so on.
This query sorts on interleaved active/inactive state:
SELECT [id],
[name],
[status]
FROM (
(
SELECT
Row_number() OVER(ORDER BY id) AS RowNo,
0 AS sorter,
[id],
[name],
[status]
FROM testtable
WHERE [status] = 'active'
)
UNION ALL
(
SELECT
Row_number() OVER(ORDER BY id) AS RowNo,
1 AS sorter,
[id],
[name],
[status]
FROM testtable
WHERE [status] = 'inactive'
)
) innerUnion
ORDER BY ( RowNo * 2 + sorter )
This approach uses an inner UNION ALL of two SELECT statements, one returning the active rows and the other the inactive rows. Each generates its own row number, which is later multiplied by two so that it is always even. The sorter column is just a bit (0 for active, 1 for inactive); adding it to the doubled row number yields an even number for active rows and an odd number for inactive rows, so every row gets a unique sort key and the results are interleaved.
The SQL Fiddle link is here, to allow testing and manipulation:
http://sqlfiddle.com/#!3/8a8a1/11/0
In the absence of a specified DB system, I've assumed that SQL Server 2008 (or newer) is being used. An alternate row numbering system would be necessary on other DBMSes.
Finally I got the answer:
SET @rank=0;
SET @rank1=0;
SELECT @rank:=@rank+1 AS rank,id,name,status FROM `testtablejohn` where status='E'
UNION
SELECT @rank1:=@rank1+1 AS rank,id,name,status FROM `testtablejohn` where status='D'
order by rank
Since you didn't post any example of what you tried so far, I will limit my answer to the general approach as well.
One approach could be to generate a row number for active rows and a row number for inactive rows. Start your numbering for active at 1 and use only odd numbers (that means increase your counter by 2 every time) and do the same thing with 2 and even numbers for the inactive rows. Put those two counters in the same column.
You will end up with a single column to easily sort on in your ORDER BY clause.
Here are some links that might be useful for you:
MySQL - Get row number on select
http://www.mysqltutorial.org/mysql-case-statement/
Just give it a go with those. If you can't make it work, then show us what you tried so far. Post some example code in the question and we might be able to guide you!
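The odd/even numbering described in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, it uses the sample rows from the question with the status values spelled consistently, and the column name sort_key is made up for the example. The per-status row number is computed with a correlated COUNT(*) so that no window functions or user variables are needed.

```python
import sqlite3

# Sample rows from the question (status spelled consistently).
ROWS = [
    (1, "John", "active"), (2, "adam", "active"), (3, "cristy", "inactive"),
    (4, "benjamin", "inactive"), (5, "mathew", "active"),
    (6, "thomas", "inactive"), (7, "james", "active"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (id INTEGER, name TEXT, status TEXT)")
conn.executemany("INSERT INTO testtable VALUES (?, ?, ?)", ROWS)

QUERY = """
SELECT t.id, t.name, t.status,
       -- twice the position of this row within its own status group (by id)
       2 * (SELECT COUNT(*) FROM testtable p
            WHERE p.status = t.status AND p.id <= t.id)
       -- active rows get odd keys (1, 3, 5, ...), inactive rows even keys (2, 4, 6, ...)
       - CASE WHEN t.status = 'active' THEN 1 ELSE 0 END AS sort_key
FROM testtable t
ORDER BY sort_key
"""

rows = conn.execute(QUERY).fetchall()
interleaved_ids = [row[0] for row in rows]
for row in rows:
    print(row)
```

Because both counters end up in the single sort_key column, one ORDER BY is enough. When one status has more rows than the other, the surplus rows simply follow at the end.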

SQL Server 2005 - SUM'ing one field, but only for the first occurrence of a second field

Platform: SQL Server 2005 Express
Disclaimer: I’m quite a novice to SQL and so if you are happy to help with what may be a very simple question, then I won’t be offended if you talk slowly and use small words :-)
I have a table where I want to SUM the contents of multiple rows. However, I want to SUM one column only for the first occurrence of text in a different column.
Table schema for table 'tblMain'
fldOne {varchar(100)} Example contents: “Dandelion”
fldTwo {varchar(8)} Example contents: “01:00:00” (represents hh:mm:ss)
fldThree {numeric(10,0)} Example contents: “65”
Contents of table:
Row number fldOne fldTwo fldThree
------------------------------------------------
1 Dandelion 01:00:00 99
2 Daisy 02:15:00 88
3 Dandelion 00:45:00 77
4 Dandelion 00:30:00 10
5 Dandelion 00:15:00 200
6 Rose 01:30:00 55
7 Daisy 01:00:00 22
etc. ad nauseam
If I use:
Select * from tblMain where fldTwo < '05:00:00' order by fldOne, fldTwo desc
Then all rows are correctly returned, ordered by fldOne and then fldTwo in descending order (although in the example data I've shown, all the data is already in the correct order!)
What I’d like to do is get the SUM of each fldThree, but only from the first occurrence of each fldOne.
So, SUM the first Dandelion, Daisy and Rose that I come across. E.g.
99+88+55
At the moment, I’m doing this programmatically; return a RecordSet from the Select statement above, and MoveNext through each returned row, only adding fldThree to my ‘total’ if I’ve never seen the text from fldOne before. It works, but most of the Select queries return over 100k rows and so it’s quite slow (slow being a relative term – it takes about 50 seconds on my setup).
The actual select statement (selecting about 100k rows from 1.5m total rows) completes in under a second, which is fine. The current programmatic loop is quite small and tight; it's just the number of iterations through the RecordSet that takes time. I'm using adOpenForwardOnly and adLockReadOnly when I open the record set.
This is a routine that basically runs continuously as more data is added, and also the fldTwo 'times' vary, so I can't be more specific with the Select statement.
Everything that I’ve so far managed to do natively with SQL seems to run quickly and I’m hoping I can take the logic (and work) away from my program and get SQL to take the strain.
Thanks in advance
The best way to approach this is with window functions. These let you enumerate the rows within a group. However, you need some way to identify the first row. SQL tables are inherently unordered, so you need a column to specify the ordering. Here are some ideas.
If you have an id column, which is defined as an identity so it is autoincremented:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by id) as seqnum
from tblMain m
) m
where seqnum = 1
To get an arbitrary row, you could use:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by (select NULL as noorder)) as seqnum
from tblMain m
) m
where seqnum = 1
Or, if FldTwo has the values in reverse order:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by FldTwo desc) as seqnum
from tblMain m
) m
where seqnum = 1
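As a quick check of the last variant against the sample data from the question, here is a small sketch. It assumes an in-memory SQLite database (version 3.25 or newer, for window function support) driven from Python's sqlite3 module instead of SQL Server 2005, and it adds the question's fldTwo < '05:00:00' filter inside the subquery.

```python
import sqlite3

# The seven sample rows shown in the question.
ROWS = [
    ("Dandelion", "01:00:00", 99), ("Daisy", "02:15:00", 88),
    ("Dandelion", "00:45:00", 77), ("Dandelion", "00:30:00", 10),
    ("Dandelion", "00:15:00", 200), ("Rose", "01:30:00", 55),
    ("Daisy", "01:00:00", 22),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblMain (fldOne TEXT, fldTwo TEXT, fldThree INTEGER)")
conn.executemany("INSERT INTO tblMain VALUES (?, ?, ?)", ROWS)

QUERY = """
SELECT SUM(fldThree)
FROM (SELECT m.*,
             -- number the rows of each fldOne group, latest fldTwo first
             ROW_NUMBER() OVER (PARTITION BY fldOne ORDER BY fldTwo DESC) AS seqnum
      FROM tblMain m
      WHERE fldTwo < '05:00:00') AS ranked
WHERE seqnum = 1
"""

total = conn.execute(QUERY).fetchone()[0]
print(total)  # 99 + 88 + 55
```

The first row per group here is the one with the largest fldTwo, which matches the ORDER BY fldOne, fldTwo DESC used in the question.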
Maybe this?
SELECT SUM(fldThree) as ExpectedSum
FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY fldOne ORDER BY fldTwo DESC) Rn
FROM tblMain) as A
WHERE Rn = 1

Any option except cursor in this kind of group by?

I have sample data like this:
Johnson; Michael, Surendir;Mishra, Mohan; Ram
Johnson; Michael R.
Mohan; Anaha
Jordan; Michael
Maru; Tushar
The output of the query should be:
Johnson; Michael 2
Mohan; Anaha 1
Jordan; Michael 1
Maru; Tushar 1
Surendir;Mishra 1
Mohan; Ram 1
As you can see, it prints the count of each name (names are separated by ,), but with a twist. We cannot simply group by the full name, because a name sometimes includes a middle initial and sometimes does not. E.g. Johnson; Michael and Johnson; Michael R. are counted as a single name, so their count is 2. Furthermore, only one of Johnson; Michael or Johnson; Michael R. should appear in the result set with a count of 2 (not both, because that would be a repeated record).
The table contains names separated by , and it is not possible to normalize it, as it is LIVE and given to us by someone else.
Is there any way to write a query for this without using a cursor? I have around 3 million records in my DB and I also have to support pagination etc. What do you think would be the best way to achieve this?
This is why your data should be normalised.
;with cte as
(
select 1 as Item, 1 as Start, CHARINDEX(',',People+',' , 1) as Split,
People+',' as People
from YourHorribleTable
union all
select cte.Item+1, cte.Split+1, nullif(CHARINDEX(',',people, cte.Split+1),0), People as Split
from cte
where cte.Split<>0
)
select Person, COUNT(*)
from
(
select case when nullif(charindex (' ', person, 2+nullif(CHARINDEX(';', person),0)),0) is null then person
else substring(person,1,charindex (' ', person, 2+nullif(CHARINDEX(';', person),0)))
end as Person
from
(
select LTRIM(RTRIM( SUBSTRING(people, start,isnull(split,len(People)+1)-start))) as person
from cte
) v
where person<>''
) v
group by Person
order by COUNT(*) desc