Count the matched values in sqlite - sql

I want to count the matched values in data like in (table1)
name id subject
maria 01 Math computer english
faro 02 Computer stat english
hina 03 Chemistry physics bio
The below query
Select *
from table1
where subject like '%english%' or
subject like '%stat%'
returns the first two rows.
But I also need the number of matched values per row, like the output below:
count
1
2
0
(Because in the first row only one value matches, in the second row two matches and in third row there are no matches).
Can I get this output?

You may try summing CASE expressions which check each condition:
SELECT
subject,
CASE WHEN subject LIKE '%english%' THEN 1 ELSE 0 END +
CASE WHEN subject LIKE '%stat%' THEN 1 ELSE 0 END AS count
FROM
yourTable;
If you instead wanted a count of the words in each subject that match neither of the two keywords (assuming the words are separated by single spaces), you could try:
SELECT
subject,
LENGTH(subject) - LENGTH(REPLACE(subject, ' ', '')) + 1 -
( subject LIKE '%English%' ) - ( subject LIKE '%stat%' ) AS count
FROM yourTable;

SQLite evaluates boolean expressions to 1 (true) or 0 (false), so you can add the conditions directly:
select *,
(subject like '%english%')
+
(subject like '%stat%') as count
from table1
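
As a rough way to try the boolean-sum approach, the sketch below loads the question's three rows into an in-memory SQLite database through Python's sqlite3 module and builds one `(subject LIKE ?)` term per keyword. The helper name `count_matches` and the alias `match_count` are made up for this illustration, and the keyword list is assumed to be non-empty.

```python
import sqlite3

def count_matches(keywords):
    """Return (name, number of keywords found in subject) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (name TEXT, id TEXT, subject TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?)",
        [
            ("maria", "01", "Math computer english"),
            ("faro", "02", "Computer stat english"),
            ("hina", "03", "Chemistry physics bio"),
        ],
    )
    # Each "(subject LIKE ?)" evaluates to 1 or 0 in SQLite, so the sum
    # is the number of keywords that occur in the row.
    terms = " + ".join("(subject LIKE ?)" for _ in keywords)
    sql = f"SELECT name, {terms} AS match_count FROM table1 ORDER BY id"
    params = [f"%{kw}%" for kw in keywords]
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows

print(count_matches(["english", "stat"]))
```

Passing the keywords as bound parameters keeps the generated SQL free of user-supplied text; only the number of terms varies.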

Related

SQL: extracting a number in middle of a cell from a different table

I need help extracting a number within a number from a different table.
I'll explain:
1st table has phone numbers.
example: +12125634533, +41542858585
2nd table has
country code | second column
-------------+--------------
1 | usa
41 | switzerland
How do I get the operator code from within each number?
example:
+12125634533 -- operator is 212 (1 is the country code, 5634533 is the phone number - always 7 digits)
+41542858585 -- operator is 54 (41 is the country code, 2858585 is the phone number).
Assuming the country codes are unique prefixes and the phone numbers always start with +, something like this should work:
select t1.*,
substring(t1.phonenumber, 2 + len(t2.countrycode),
len(t1.phonenumber) - 7 - len(t2.countrycode) - 1
)
from table1 t1 left join
table2 t2
on t1.phonenumber like '+' + t2.countrycode + '%';
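
The query above uses `len` and `+` for concatenation. A minimal sketch of the same idea in SQLite, run through Python's sqlite3 module, could look like this; the table names `phones` and `codes` are placeholders chosen for the example, and the country code is assumed to be stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phones (phonenumber TEXT);
    CREATE TABLE codes (countrycode TEXT, country TEXT);
    INSERT INTO phones VALUES ('+12125634533'), ('+41542858585');
    INSERT INTO codes VALUES ('1', 'usa'), ('41', 'switzerland');
""")

# Skip the '+' and the country code, then take everything except the
# last 7 digits (the subscriber number).
operators = conn.execute("""
    SELECT p.phonenumber,
           substr(p.phonenumber,
                  2 + length(c.countrycode),
                  length(p.phonenumber) - 7 - length(c.countrycode) - 1) AS operator
    FROM phones p
    LEFT JOIN codes c
           ON p.phonenumber LIKE '+' || c.countrycode || '%'
    ORDER BY p.phonenumber
""").fetchall()

print(operators)
```

If one country code were a prefix of another (for example 1 and 12), a number could join to both rows, which is why the unique-prefix assumption matters.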

SQL Rows to Columns if column values are unknown

I have a table that has demographic information about a set of users which looks like this:
User_id Category IsMember
1 College 1
1 Married 0
1 Employed 1
1 Has_Kids 1
2 College 0
2 Married 1
2 Employed 1
3 College 0
3 Employed 0
The result set I want is a table that looks like this:
User_Id|College|Married|Employed|Has_Kids
1 1 0 1 1
2 0 1 1 0
3 0 0 0 0
In other words, the table indicates the presence or absence of a category for each user. Sometimes the user will have a category where the value is false, and sometimes the user will have no row for a category, in which case IsMember is assumed to be false.
Also, from time to time additional categories will be added to the data set, and I'm wondering if it's possible to do this query without knowing all the possible category names up front. In other words, I won't be able to specify all the column names I want in the result. (Note that only user 1 has category "has_kids" and user 3 is missing a row for category "married".)
(using Postgres)
Thanks.
You can use jsonb functions.
with titles as (
select jsonb_object_agg(Category, Category) as titles,
jsonb_object_agg(Category, -1) as defaults
from demog
),
the_rows as (
select null::bigint as id, titles as data
from titles
union
select User_id, defaults || jsonb_object_agg(Category, IsMember)
from demog, titles
group by User_id, defaults
)
select id, string_agg(value, '|' order by key)
from (
select id, key, value
from the_rows, jsonb_each_text(data)
) x
group by id
order by id nulls first
You can see a running example in http://rextester.com/QEGT70842
You can replace -1 with 0 for the default value and '|' with ',' for the separator.
You can install the tablefunc module and use the crosstab function.
https://www.postgresql.org/docs/9.1/static/tablefunc.html
I found a Postgres function script called colpivot which does the trick. I ran the script to create the function, then created the table in one statement:
select colpivot ('_pivoted', 'select * from user_categories', array['user_id'],
array ['category'], '#.is_member', null);
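
A different route from the answers above, when the query is issued from application code, is to read the category names first and then generate one conditional-aggregate column per category. The sketch below does this against SQLite through Python's sqlite3 module purely so it is runnable; it is not the jsonb, crosstab or colpivot method. The helper names `pivot_membership` and `quote_ident` are invented for the example, and it assumes the table holds at least one category.

```python
import sqlite3

def quote_ident(name):
    """Quote a value for use as an SQL identifier (column alias)."""
    return '"' + name.replace('"', '""') + '"'

def pivot_membership(conn):
    """Return (header, rows) with one 0/1 column per category found in demog."""
    categories = [r[0] for r in conn.execute(
        "SELECT DISTINCT Category FROM demog ORDER BY Category")]
    # Missing (user, category) pairs fall through to ELSE 0, i.e. "false".
    columns = ", ".join(
        f"MAX(CASE WHEN Category = ? THEN IsMember ELSE 0 END) AS {quote_ident(c)}"
        for c in categories)
    sql = (f"SELECT User_id AS User_id, {columns} "
           "FROM demog GROUP BY User_id ORDER BY User_id")
    cur = conn.execute(sql, categories)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demog (User_id INTEGER, Category TEXT, IsMember INTEGER)")
conn.executemany("INSERT INTO demog VALUES (?, ?, ?)", [
    (1, "College", 1), (1, "Married", 0), (1, "Employed", 1), (1, "Has_Kids", 1),
    (2, "College", 0), (2, "Married", 1), (2, "Employed", 1),
    (3, "College", 0), (3, "Employed", 0),
])
header, rows = pivot_membership(conn)
print(header)
print(rows)
```

The columns come out in alphabetical order of category name; a new category in the data shows up as a new column the next time the function runs.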

SQL: Find rows that match closely but not exactly

I have a table inside a PostgreSQL database with columns c1,c2...cn. I want to run a query that compares each row against a tuple of values v1,v2...vn. The query should not return an exact match but should return a list of rows ordered in descending similarity to the value vector v.
Example:
The table contains sports records:
1,USA,basketball,1956
2,Sweden,basketball,1998
3,Sweden,skating,1998
4,Switzerland,golf,2001
Now when I run a query against this table with v=(Sweden,basketball,1998), I want to get all records that have a similarity with this vector, sorted by number of matching columns in descending order:
2,Sweden,basketball,1998 --> 3 columns match
3,Sweden,skating,1998 --> 2 columns match
1,USA,basketball,1956 --> 1 column matches
Row 4 is not returned because it does not match at all.
Edit: All columns are equally important. Although, when I really think of it... it would be a nice add-on if I could give each column a different weight factor as well.
Is there any possible SQL query that would return the rows in a reasonable amount of time, even when I run it against a million rows?
What would such a query look like?
SELECT * FROM countries
WHERE country = 'Sweden'
OR sport = 'basketball'
OR year = 1998
ORDER BY
cast(country = 'Sweden' AS integer) +
cast(sport = 'basketball' as integer) +
cast(year = 1998 as integer) DESC
It's not beautiful, but it works. You can cast the boolean expressions to integers and sum them.
You can easily change the weight of a column by adding a multiplier:
cast(sport = 'basketball' as integer) * 5 +
This is how I would do it. The multiplication factors in the CASE expressions set the importance (weight) of each match, and they ensure that records matching on the highest-weighted columns come out on top even if their other columns don't match.
/*
-- Initial Setup
-- drop table sport
create table sport (id int, Country varchar(20) , sport varchar(20) , yr int )
insert into sport values
(1,'USA','basketball','1956'),
(2,'Sweden','basketball','1998'),
(3,'Sweden','skating','1998'),
(4,'Switzerland','golf','2001')
select * from sport
*/
select * ,
CASE WHEN Country='Sweden' then 1 else 0 end * 100 +
CASE WHEN sport='basketball' then 1 else 0 end * 10 +
CASE WHEN yr=1998 then 1 else 0 end * 1 as Match
from sport
WHERE
country = 'Sweden'
OR sport = 'basketball'
OR yr = 1998
ORDER BY Match Desc
It might help if you wrote a stored procedure that calculates a "similarity metric" between two rows. Then your query could refer to the return value of that procedure directly rather than having umpteen conditions in the where-expression and the order-by-expression.
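
To make the "similarity metric" idea concrete, here is a minimal sketch that registers a Python function as an SQL function in SQLite via `sqlite3.Connection.create_function`. It stands in for a stored procedure only for illustration; the function name `similarity` and the `WEIGHTS` tuple are assumptions, not part of the question.

```python
import sqlite3

# Hypothetical per-column weights (country, sport, yr); all equal here.
WEIGHTS = (1, 1, 1)

def similarity(country, sport, yr, v_country, v_sport, v_yr):
    """Weighted count of columns that equal the corresponding search value."""
    pairs = ((country, v_country), (sport, v_sport), (yr, v_yr))
    return sum(w for w, (a, b) in zip(WEIGHTS, pairs) if a == b)

conn = sqlite3.connect(":memory:")
conn.create_function("similarity", 6, similarity)
conn.execute("CREATE TABLE sport (id INTEGER, country TEXT, sport TEXT, yr INTEGER)")
conn.executemany("INSERT INTO sport VALUES (?, ?, ?, ?)", [
    (1, "USA", "basketball", 1956),
    (2, "Sweden", "basketball", 1998),
    (3, "Sweden", "skating", 1998),
    (4, "Switzerland", "golf", 2001),
])

# Score every row once in the subquery, drop non-matches, best match first.
ranked = conn.execute("""
    SELECT id, score
    FROM (SELECT id, similarity(country, sport, yr, ?, ?, ?) AS score FROM sport)
    WHERE score > 0
    ORDER BY score DESC, id
""", ("Sweden", "basketball", 1998)).fetchall()

print(ranked)
```

A function evaluated per row like this generally forces a full table scan, so on a large table it may be worth keeping an indexable `WHERE country = ... OR sport = ... OR yr = ...` filter alongside it.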

SQL using fallback column for match

Say I have a table in an sql database like
name age shoesize
---------------------
tom 20 NULL
dick NULL 4
harry 30 5
and I want an SQL statement that selects names that have age == X, or as a fallback, if no such names exist, use those with shoe size == Y. In other words, in this table, for X=20,Y=4 I should only get 'tom', while for X=25,Y=4 I should get only 'dick'. I can't do that with
SELECT name FROM tab WHERE age = 20 OR shoesize = 4;
because that will select both tom and dick. I'm currently using
SELECT COALESCE ((SELECT name FROM tab WHERE age = 20),(SELECT name FROM tab WHERE shoesize = 4));
but is there a neater way? Also using coalesce like this doesn't allow me to get the whole row - i.e. I can't use SELECT * FROM tab, I can only select a single name.
You can use ORDER BY and FETCH FIRST 1 ROW ONLY or some similar clause:
SELECT name
FROM tab
ORDER BY (CASE WHEN age = X THEN 1
WHEN shoesize = Y THEN 2
ELSE 3
END)
FETCH FIRST 1 ROW ONLY;
Some databases spell FETCH FIRST 1 ROW ONLY differently, for example LIMIT 1 (MySQL, PostgreSQL, SQLite) or TOP 1 (SQL Server).
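
A small runnable sketch of the ordering trick, using SQLite's `LIMIT 1` through Python's sqlite3 module. It adds a `WHERE` clause, which is an assumption on top of the answer's query, so that no row is returned when neither condition matches; the helper name `pick_with_fallback` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (name TEXT, age INTEGER, shoesize INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", [
    ("tom", 20, None),
    ("dick", None, 4),
    ("harry", 30, 5),
])

def pick_with_fallback(conn, x, y):
    """Return the whole row with age = x, else the row with shoesize = y, else None."""
    return conn.execute("""
        SELECT name, age, shoesize
        FROM tab
        WHERE age = :x OR shoesize = :y
        ORDER BY CASE WHEN age = :x THEN 1 ELSE 2 END
        LIMIT 1
    """, {"x": x, "y": y}).fetchone()

print(pick_with_fallback(conn, 20, 4))
print(pick_with_fallback(conn, 25, 4))
```

Because of the one-row limit, this returns a single row even when several rows share the same age or shoe size.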

Searching for a number in a database column that contains a series of numbers separated by the delimiter "&" in SQLite

My table structure is as follows:
id category
1 1&2&3
2 18&2&1
3 11
4 1&11
5 3&1
6 1
My question: I need an SQL query that returns the following result set when the user searches for category 1:
id category
1 1&2&3
2 18&2&1
4 1&11
5 3&1
6 1
But I am getting all the rows, not just the expected ones.
I have tried the regexp and like operators, without success:
select * from mytable where category like '%1%'
select * from mytable where category regexp '([.]*)(1)(.*)'
I don't really know regexp; I just found it.
Please help me out.
For matching a list item separated by &, use:
SELECT * FROM mytable WHERE '&'||category||'&' LIKE '%&1&%';
This matches only the entire item (i.e. only 1, not 11 or 18), whether it is at the beginning, middle, or end of the list.
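
The delimiter-wrapping trick can be checked against the question's data with a short script using Python's sqlite3 module. The searched value is passed as a bound parameter here, which is an addition for the example; it is assumed to contain no `%` or `_`, since LIKE would treat those as wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, category TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    (1, "1&2&3"), (2, "18&2&1"), (3, "11"),
    (4, "1&11"), (5, "3&1"), (6, "1"),
])

def ids_with_category(conn, wanted):
    """Return ids whose '&'-separated category list contains exactly `wanted`."""
    # Wrapping both sides in '&' makes first, middle and last items look alike.
    rows = conn.execute("""
        SELECT id
        FROM mytable
        WHERE '&' || category || '&' LIKE '%&' || ? || '&%'
        ORDER BY id
    """, (wanted,))
    return [r[0] for r in rows]

print(ids_with_category(conn, "1"))
```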