One result from join between 2 tables? - sql

It is a bit complicated to fit into the title, but the situation is as follows:
- I have 2 tables:
Users, Preferences.
defined as follows
Users :
ID | NAME | FK_NODE_PREFERENCES
Preferences :
ID | FK_USERS | PREFERENCE_DESCRIPTION
The idea is to have a lot of preferences for each user.
The results should look as follows:
USER | ALL_PREFERENCES
I need to search by a part of the preference string, and I need only 1 row in the result of the SELECT query, holding all the preferences related to the user as text in a single record. How can I do that?

I'm not familiar with Firebird, but it seems to have a LIST() function which should be an equivalent to the GROUP_CONCAT() function in MySQL. (http://www.firebirdsql.org/refdocs/langrefupd21-aggrfunc-list.html)
So, basically the query should look something like this:
SELECT Users.name, LIST(Preferences.preference_description, ' ') AS ALL_PREFERENCES
FROM Users
JOIN Preferences ON Preferences.fk_users = Users.id
WHERE Preferences.preference_description LIKE '%abc%'
GROUP BY Users.name
So... not sure if this actually works, but the direction should be the right one...
Hope that helps!
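A minimal sketch of the same idea, runnable with Python's built-in sqlite3 module. SQLite's GROUP_CONCAT stands in for Firebird's LIST here, and the sample rows and the helper name preferences_matching are made up for illustration. It filters with EXISTS instead of the outer WHERE, on the assumption that the concatenated column should still hold all of a matching user's preferences and not only the ones that match the search string:

```python
import sqlite3

def preferences_matching(fragment):
    """Return (name, all_preferences) for users with a preference containing `fragment`."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data following the question's table layout.
    conn.executescript("""
        CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Preferences (id INTEGER PRIMARY KEY, fk_users INTEGER,
                                  preference_description TEXT);
        INSERT INTO Users VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO Preferences VALUES
            (1, 1, 'dark theme'), (2, 1, 'large font'), (3, 2, 'light theme');
    """)
    rows = conn.execute("""
        SELECT u.name,
               GROUP_CONCAT(p.preference_description, ', ') AS all_preferences
        FROM Users u
        JOIN Preferences p ON p.fk_users = u.id
        -- Keep a user if ANY of their preferences matches the fragment.
        WHERE EXISTS (SELECT 1 FROM Preferences m
                      WHERE m.fk_users = u.id
                        AND m.preference_description LIKE '%' || ? || '%')
        GROUP BY u.id, u.name
        ORDER BY u.name
    """, (fragment,)).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for name, prefs in preferences_matching("dark"):
        print(name, "|", prefs)
```

The order of the items inside the concatenated string is not guaranteed, so code that consumes it should not rely on it.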

Related

Select Last movement for each ID in Access

I am working on an MS Access table and would like a query that returns all the information on the last entry for a given ID.
My table (DEPOSIT_MOVEMENTS) is the following:
MOV_CODE | WORK | DEPOSIT_CODE | TYPE | DATE | DESTINATION
I am looking to obtain, for each DEPOSIT_CODE, the latest record (by date), and to get its MOV_CODE so that I can get the DESTINATION of the item.
DEPOSIT_CODE may have many MOV_CODE on different dates.
I have tried different options posted on Stack Overflow, but I could not get any of them to work properly.
Right now I am trying with the GROUP BY, but cannot get it working.
SELECT t1.[DEPOSIT_CODE], MAX(t1.[DATE]), t1.[MOV_CODE]
FROM [DEPOSIT_MOVEMENTS] AS t1
GROUP BY t1.[DEPOSIT_CODE];
Any help or guidance is welcome.
Kind regards,
Here is one method:
SELECT dm.*
FROM [DEPOSIT_MOVEMENTS] AS dm
WHERE dm.[DATE] = (SELECT MAX(dm2.[DATE])
                   FROM [DEPOSIT_MOVEMENTS] AS dm2
                   WHERE dm2.[DEPOSIT_CODE] = dm.[DEPOSIT_CODE]
                  );
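To try the correlated-subquery approach outside Access, here is a small sketch using Python's built-in sqlite3 module. The sample rows and the function name latest_movements are invented for illustration, and dates are stored as ISO text so that MAX compares them correctly in SQLite:

```python
import sqlite3

def latest_movements():
    """Return (DEPOSIT_CODE, MOV_CODE, DESTINATION) of the latest movement per deposit."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the columns needed for the demo are created.
    conn.executescript("""
        CREATE TABLE DEPOSIT_MOVEMENTS (MOV_CODE INTEGER, DEPOSIT_CODE TEXT,
                                        "DATE" TEXT, DESTINATION TEXT);
        INSERT INTO DEPOSIT_MOVEMENTS VALUES
            (1, 'D1', '2020-01-01', 'north'),
            (2, 'D1', '2020-03-01', 'south'),
            (3, 'D2', '2020-02-01', 'east');
    """)
    rows = conn.execute("""
        SELECT dm.DEPOSIT_CODE, dm.MOV_CODE, dm.DESTINATION
        FROM DEPOSIT_MOVEMENTS AS dm
        -- Keep only the row whose date equals the latest date of its deposit.
        WHERE dm."DATE" = (SELECT MAX(dm2."DATE")
                           FROM DEPOSIT_MOVEMENTS AS dm2
                           WHERE dm2.DEPOSIT_CODE = dm.DEPOSIT_CODE)
        ORDER BY dm.DEPOSIT_CODE
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(latest_movements())
```

If two movements of the same deposit share the latest date, both rows come back, so a tie-breaker (for example the highest MOV_CODE) may be needed.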

How to Group_Concat with a 3-table JOIN for genealogy

I am failing to grasp how I can get the following outcome. I thought perhaps via GROUP_CONCAT, but I am also joining on 3 tables, and unclear on the correct syntax or if this is even the best approach.
Generic table layout:
Table Users: user_id | first | last
Table Orgs org_id | org_name
Table Relationship user_id | org_id | start_year | end_year
The relationship table has MANY entries, that may be associated with that specific user_id.
I need to get the User columns: id, first, last. I'd like to try and group the org data into 1 concatenated, delimited field. Maybe a double group_concatenation is needed? Which would consist of the org_id, org_name, start_year & end_year for all records in the relationship table that match the user_id. I'm hoping for an output like this:
Each '|' represents a new column/piece of data.
If there was only 1 org_id associated with the user_id, the output would be (similar) to:
user_id | first | last | org_id-org_name-start_year-end_year
If there were more than 1 org found/associated with that user_id, the output would have more concatenated/delimited data in the same column:
user_id | first | last | org_id-org_name-start_year-end_year^org_id-org_name-start_year-end_year^org_id-org_name-start_year-end_year
(Notice the '-' delimiter between values and the '^' delimiter between new 'org-grouped' data.)
When I grab that data, I can then just break it up (on the backend/PHP side of things) into an array or whatever.
I'm not sure how I can GROUP_CONCAT (if that is even the best approach here?) while I have to JOIN on 3 separate tables.
This is not my REAL query. (I'm not sure if I should post it, as I do not want to cause any confusion as it does NOT match my dummy table/column names.)
I just wanted to show my attempt that gets me 3 individual rows, (using my JOINS) but no GROUP_CONCAT stuff:
SELECT genealogy_users.imis_id, genealogy_users.full_name,
genealogy_users.member_email, genealogy_orgs.org_id,
genealogy_orgs.org_name, genealogy_relations.user_id,
genealogy_relations.relation_type, genealogy_relations.start_year,
genealogy_relations.end_year
FROM genealogy_users
INNER JOIN genealogy_relations ON genealogy_users.imis_id = genealogy_relations.user_id
INNER JOIN genealogy_orgs ON genealogy_relations.org_id = genealogy_orgs.org_id
WHERE genealogy_users.imis_id = '00003';
UPDATE:
Well, I seem to have fudged my way through it, but I'm not sure how legit this is.
It's -ALMOST- there. I believe I still need a JOIN or something, since genealogy_orgs.org_id = '84864' is hardcoded and it should NOT be. Maybe it needs to come from a JOIN?
SELECT genealogy_users.*,
(SELECT GROUP_CONCAT(org_id,'-',
(SELECT org_name FROM genealogy_orgs WHERE genealogy_orgs.org_id = '84864'),
'-',start_year,'-',end_year,'^')
FROM genealogy_relations WHERE genealogy_relations.user_id = genealogy_users.imis_id
) AS alumni_list
FROM genealogy_users
WHERE genealogy_users.imis_id = '00003';
UPDATE 2:
My final attempt, which I think is getting me what I need. (But it's late, and I'll check back tomorrow and look at things more closely.)
SELECT genealogy_users.imis_id, genealogy_users.full_name,
genealogy_users.member_email, genealogy_orgs.org_id,
genealogy_orgs.org_name, genealogy_relations.user_id,
genealogy_relations.relation_type, genealogy_relations.start_year,
genealogy_relations.end_year,
(SELECT GROUP_CONCAT(org_id,'-',org_name,'-',start_year,'-',end_year,'^')
FROM genealogy_relations
WHERE genealogy_relations.user_id = genealogy_users.imis_id
) AS alumni_list
FROM genealogy_users
INNER JOIN genealogy_relations ON genealogy_users.imis_id = genealogy_relations.user_id
INNER JOIN genealogy_orgs ON genealogy_relations.org_id = genealogy_orgs.org_id
WHERE genealogy_users.imis_id = '00003';
Is there anything to make note of in the above attempt? Or is there a better approach? Hopefully something easily readable so it makes sense?
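One possible shape for this, sketched against the dummy table layout from the top of the question and not against the real genealogy_* tables: join all three tables, group by the user columns, and concatenate one org-year string per relationship row. It runs on Python's built-in sqlite3 module, so the SQLite spelling GROUP_CONCAT(expr, '^') is used; in MySQL the equivalent would presumably be GROUP_CONCAT(CONCAT_WS('-', ...) SEPARATOR '^'). The sample rows and the function name alumni_rows are made up:

```python
import sqlite3

def alumni_rows(user_id):
    """Return one row per user with all org data packed into a single column."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data following the generic table layout.
    conn.executescript("""
        CREATE TABLE users (user_id TEXT, first TEXT, last TEXT);
        CREATE TABLE orgs (org_id INTEGER, org_name TEXT);
        CREATE TABLE relationship (user_id TEXT, org_id INTEGER,
                                   start_year INTEGER, end_year INTEGER);
        INSERT INTO users VALUES ('00003', 'Ann', 'Lee');
        INSERT INTO orgs VALUES (1, 'Alpha'), (2, 'Beta');
        INSERT INTO relationship VALUES ('00003', 1, 2001, 2003),
                                        ('00003', 2, 2004, 2008);
    """)
    rows = conn.execute("""
        SELECT u.user_id, u.first, u.last,
               -- '-' between values, '^' between org groups
               GROUP_CONCAT(o.org_id || '-' || o.org_name || '-' ||
                            r.start_year || '-' || r.end_year, '^') AS alumni_list
        FROM users u
        JOIN relationship r ON r.user_id = u.user_id
        JOIN orgs o ON o.org_id = r.org_id
        WHERE u.user_id = ?
        GROUP BY u.user_id, u.first, u.last
    """, (user_id,)).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(alumni_rows("00003"))
```

Because the grouping collapses the joined rows, the separate org and year columns are left out of the SELECT list; selecting them alongside the concatenated column is what produces one row per relationship again.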

Ways to Clean-up messy records in sql

I have the following sql data:
ID | Company Name | Customer | Address 1 | City | State | Zip | Date
0108500 | AAA Test | Mish~Sara | Newa Claims | Chtiana | CO | 123 | 06FE0046
0108500 | AAA.Test | Mish~Sara | Newa Claims | Chtiana | CO | 123 | 06FE0046
1802600 | AAA Test Company | Ban, Adj.~Gorge | PO Box 83 | MouLaurel | CA | 153 | 09JS0025
1210600 | AAA Test Company | Biwel~Brce | 97kehst ve | Jacn | CA | 153 | 04JS0190
AAA Test, AAA.Test and AAA Test Company are considered as one company.
Since the data is messy, I'm thinking of doing one of the following:
Is there a way to search all the records in the DB for company names that are almost the same, and then rename them to the longest name?
In this case, AAA Test and AAA.Test would become AAA Test Company.
OR is there a way to filter only the records whose company names are almost the same, so that users have the option to change them?
If there's no way to do it via a SQL query, what are your suggestions for cleaning up the records? There are almost 1 million records in the database and it's hard to clean them up manually.
Thank you in advance.
You could use a string matching algorithm like Jaro-Winkler. I've written an SQL version that is used daily to deduplicate people's names that have been typed in differently. It can take a while, but it works well for the fuzzy match you're looking for.
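For reference, a compact Python sketch of Jaro-Winkler similarity (this is not the SQL version mentioned above, which isn't shown). It uses the usual defaults, which are assumptions here: a prefix scale of 0.1 and a common prefix capped at 4 characters. Scores range from 0.0 (nothing in common) to 1.0 (identical):

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for strings sharing a prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)

if __name__ == "__main__":
    print(round(jaro_winkler("AAA TEST", "AAA TEST COMPANY"), 3))
    print(round(jaro_winkler("AAA TEST", "ZZZ OTHER"), 3))
```

In practice you would normalize case and punctuation first, then treat pairs above some threshold (chosen by inspecting your own data) as candidate duplicates.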
Something like a self join? || is ANSI SQL concat, some products have a concat function instead.
select *
from tablename t1
join tablename t2 on t1.companyname like '%' || t2.companyname || '%'
Depending on datatype you may have to remove blanks from the t2.companyname, use TRIM(t2.companyname) in that case.
And, as Miguel suggests, use REPLACE to remove commas and dots etc.
Use case-insensitive collation. SOUNDEX can be used etc etc.
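A rough sketch of the self join combined with the REPLACE/TRIM/upper-casing suggestions, using Python's built-in sqlite3 module. The table and column names follow the query above; the function name similar_name_pairs and the sample names are invented. Dots and commas are turned into spaces here (an assumption, so that AAA.Test lines up with AAA Test):

```python
import sqlite3

def similar_name_pairs(names):
    """Return (name, contained_name) pairs whose normalized forms contain each other."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (companyname TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?)", [(n,) for n in names])
    rows = conn.execute("""
        WITH cleaned AS (
            -- Normalize: dots/commas to spaces, trim, upper-case.
            SELECT DISTINCT companyname,
                   UPPER(TRIM(REPLACE(REPLACE(companyname, '.', ' '), ',', ' '))) AS norm
            FROM tablename
        )
        SELECT t1.companyname, t2.companyname
        FROM cleaned t1
        JOIN cleaned t2
          ON t1.norm LIKE '%' || t2.norm || '%'
         AND t1.companyname <> t2.companyname
        ORDER BY t1.companyname, t2.companyname
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = ["AAA Test", "AAA.Test", "AAA Test Company", "Other Inc"]
    for longer, shorter in similar_name_pairs(sample):
        print(repr(longer), "contains", repr(shorter))
```

Note that % and _ inside a company name act as LIKE wildcards, and that a self join with leading-wildcard LIKE compares every pair of rows, so on a table with about a million records it is likely to be slow without narrowing the candidates first.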
I think most database servers support full-text search, and if so, there are some full-text functions that support proximity searches.
For example, there is a NEAR operator in SQL Server; here is its documentation: https://msdn.microsoft.com/en-us/library/ms142568.aspx
You can do the clean-up in several stages.
Create new columns
Convert everything to upper case, remove punctuation and whitespace, then match on the first 6 to 10 characters (using a self join). Assuming your table is called "vendor", add two columns, "status" and "dupstr", then update as follows:
/** Populate dupstr column for fuzzy match **/
update vendor v
set v.dupstr = left(upper(replace(replace(v.companyname,'.',''),' ','')),6)
;
Identify duplicate records
Add an index on the dupstr column, then do an update like this to identify "good" records:
/** Mark the good duplicates **/
update vendor v
set v.status = 'keep' --indicate keeper record
where
  --dupes to clean up
  exists ( select 1 from vendor v1 where v.dupstr = v1.dupstr
           and v.id != v1.id )
  and
  ( --keeper has longest name
    length(v.companyname) =
      ( select max(length(v2.companyname)) from vendor v2
        where v.dupstr = v2.dupstr
      )
    or
    --keeper has latest record (assuming ID is sequential)
    v.id =
      ( select max(v3.id) from vendor v3
        where v.dupstr = v3.dupstr
      )
  )
;
The above SQL can be refined to add a "dupe" status to the other records, or you can do a separate update.
Clean Up Stragglers
Report any remaining partial matches to be reviewed by a human (i.e. dupe records without a keeper record)
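The first two stages can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: SQLite spells left(x, 6) as SUBSTR(x, 1, 6), only the "longest name" rule is used to pick the keeper (the "latest ID" alternative is left out to keep a single keeper per group), and the function name mark_keepers and the sample names are made up:

```python
import sqlite3

def mark_keepers(names):
    """Build the dupstr key, mark one keeper per duplicate group, return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vendor (id INTEGER PRIMARY KEY, companyname TEXT, "
                 "dupstr TEXT, status TEXT)")
    conn.executemany("INSERT INTO vendor (companyname) VALUES (?)",
                     [(n,) for n in names])
    # Stage 1: upper-case, strip dots and spaces, keep the first 6 characters.
    conn.execute("""
        UPDATE vendor
        SET dupstr = SUBSTR(UPPER(REPLACE(REPLACE(companyname, '.', ''), ' ', '')), 1, 6)
    """)
    # Stage 2: in every group with more than one row, keep the longest name.
    conn.execute("""
        UPDATE vendor
        SET status = 'keep'
        WHERE EXISTS (SELECT 1 FROM vendor v1
                      WHERE v1.dupstr = vendor.dupstr AND v1.id <> vendor.id)
          AND LENGTH(companyname) = (SELECT MAX(LENGTH(v2.companyname))
                                     FROM vendor v2
                                     WHERE v2.dupstr = vendor.dupstr)
    """)
    rows = conn.execute(
        "SELECT companyname, dupstr, status FROM vendor ORDER BY id").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in mark_keepers(["AAA Test", "AAA.Test", "AAA Test Company", "Zed Ltd"]):
        print(row)
```

Rows in a duplicate group that are not marked "keep" are the ones to rename or merge; groups where two names tie for the longest would get two keepers and belong in the human-review stage.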
You can use a SQL query with SOUNDEX or DIFFERENCE.
For example:
SELECT DIFFERENCE ('AAA Test','AAA Test Company')
DIFFERENCE returns a value from 0 to 4 (4 = almost the same, 0 = totally different).
See also: https://learn.microsoft.com/en-us/sql/t-sql/functions/difference-transact-sql?view=sql-server-2017

SQL Server 2008 AND/OR operators and parentheses

I am having a bit of trouble with AND/OR operators. It's easy stuff, but I'm having trouble with the following scenario:
WHERE (ID = '111') OR (charge_desc LIKE '%garn%') OR
(charge_desc LIKE '%levy%') OR
(charge_desc LIKE '%exe%')
Basically, I am trying to find the keywords based only on ID 111. However, what's happening with the current query is that it looks for those keywords across a bunch of different IDs instead. I've tried to manipulate the parentheses, and I see the issue: it can't be "111 OR LIKE XYZ...", but it can't be AND either, because if you use AND everywhere, it looks for that ID with EXACTLY all 3 criteria.
I am looking to do ID 111 and look for garn or levy or exe - not "AND"; it should be every variation that comes up, but ONLY for 111.
I hope this makes sense. I feel it's something with the parentheses.
I just tried this:
WHERE ID = 111
HAVING garn or levy or exe
Forgive my lack of syntax there. :)
If you absolutely must have id = 111 then this is an and relationship, not an or relationship. You can then have a series of or operators between the other conditions:
WHERE ID = '111' AND
(charge_desc LIKE '%garn%' OR
charge_desc LIKE '%levy%' OR
charge_desc LIKE '%exe%')
The problem is with the parentheses.
Let's look at this scenario:
WHERE (ID = '111') OR (charge_desc LIKE '%garn%') OR
(charge_desc LIKE '%levy%') OR
(charge_desc LIKE '%exe%')
You are saying WHERE ID = 111 OR charge_desc LIKE garn, etc., so a row qualifies as soon as any one of the conditions is true.
WHERE (ID = '111') AND
(charge_desc LIKE '%garn%' OR
charge_desc LIKE '%levy%' OR
charge_desc LIKE '%exe%')
Now you're saying WHERE ID = 111 AND at least one of the OR conditions inside the parentheses matches. Remember, the keyword match is required in addition to the ID match.
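The difference is easy to see on a few rows. Here is a small sketch with Python's built-in sqlite3 module; the table name charges, the sample rows, and the function name compare_filters are made up for the demo:

```python
import sqlite3

def compare_filters():
    """Run the all-OR filter and the AND-(OR) filter on the same sample rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data.
    conn.executescript("""
        CREATE TABLE charges (ID TEXT, charge_desc TEXT);
        INSERT INTO charges VALUES
            ('111', 'wage garnishment'),
            ('111', 'late fee'),
            ('222', 'tax levy');
    """)
    # Any single condition is enough, so other IDs leak in.
    all_or = conn.execute("""
        SELECT ID, charge_desc FROM charges
        WHERE (ID = '111') OR (charge_desc LIKE '%garn%') OR
              (charge_desc LIKE '%levy%') OR
              (charge_desc LIKE '%exe%')
        ORDER BY ID, charge_desc
    """).fetchall()
    # ID must match AND at least one keyword must match.
    and_or = conn.execute("""
        SELECT ID, charge_desc FROM charges
        WHERE ID = '111' AND
              (charge_desc LIKE '%garn%' OR
               charge_desc LIKE '%levy%' OR
               charge_desc LIKE '%exe%')
        ORDER BY ID, charge_desc
    """).fetchall()
    conn.close()
    return all_or, and_or

if __name__ == "__main__":
    all_or, and_or = compare_filters()
    print("all OR:  ", all_or)
    print("AND (OR):", and_or)
```

The all-OR version returns every sample row, including the one for ID 222, while the AND version returns only the garnishment row for ID 111.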

SQL - Getting a column from another table to join this query

I've got the code below, which displays the location_id and the total number of antisocial crimes, but I would also like to output the location_name from a different table called location_dim. I tried to find a way to UNION it but couldn't get it to work. Any ideas?
SELECT fk5_location_id , COUNT(fk3_crime_id) as TOTAL_ANTISOCIAL_CRIMES
from CRIME_FACT
WHERE fk1_time_id = 3 AND fk3_crime_id = 1
GROUP BY fk5_location_id;
You want to use join to lookup the location name. The query would probably look like this:
SELECT ld.location_name, COUNT(cf.fk3_crime_id) as TOTAL_ANTISOCIAL_CRIMES
from CRIME_FACT cf join
LOCATION_DIM ld
on cf.fk5_location_id = ld.location_id
WHERE cf.fk1_time_id = 3 AND cf.fk3_crime_id = 1
GROUP BY ld.location_name;
You need to put in the right column names for ld.location_name and ld.location_id.
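A quick way to check the join on sample data is Python's built-in sqlite3 module. In this sketch the column names location_name and location_id in LOCATION_DIM are assumptions (as noted above), and the rows and the function name antisocial_by_location are made up:

```python
import sqlite3

def antisocial_by_location():
    """Count antisocial crimes (crime id 1, time id 3) per location name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; LOCATION_DIM column names are assumed.
    conn.executescript("""
        CREATE TABLE LOCATION_DIM (location_id INTEGER, location_name TEXT);
        CREATE TABLE CRIME_FACT (fk1_time_id INTEGER, fk3_crime_id INTEGER,
                                 fk5_location_id INTEGER);
        INSERT INTO LOCATION_DIM VALUES (1, 'Leeds'), (2, 'York');
        INSERT INTO CRIME_FACT VALUES
            (3, 1, 1), (3, 1, 1), (3, 1, 2),  -- counted
            (3, 2, 2), (4, 1, 2);             -- filtered out by WHERE
    """)
    rows = conn.execute("""
        SELECT ld.location_name, COUNT(cf.fk3_crime_id) AS TOTAL_ANTISOCIAL_CRIMES
        FROM CRIME_FACT cf
        JOIN LOCATION_DIM ld ON cf.fk5_location_id = ld.location_id
        WHERE cf.fk1_time_id = 3 AND cf.fk3_crime_id = 1
        GROUP BY ld.location_name
        ORDER BY ld.location_name
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(antisocial_by_location())
```

UNION stacks the rows of two queries on top of each other, which is why it didn't help here; a join is what adds columns from a second table to each row.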
You need to find a relationship between the two tables that links a location to a crime. That way you can use a "join" and select the fields you are interested in from each table.
I suggest taking a step back and reading up on the fundamentals of relational databases. There are many good books out there, which are a good place to start.