I have a varchar field in database table A; let's call it store_name. This field gets its value from entity A. Entity B enters store_name into a different database table, B. I want to get all records in table A where store_name matches the values in table B.
How would you recommend writing the query, given that I don't control the values of those two fields?
What do you think about PostgreSQL fuzzystrmatch?
The tables contain thousands of records.
Thanks
I assume that table A and table B are in the same database and that, since you don't control how the data is inserted, you can't be sure the values use the same case or are spelled the same way.
Case 1: If the only problem is a case mismatch, you can use ilike:
Select a.store_name
from a, b
Where a.store_name ilike b.store_name
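One caveat with the pattern-based comparison above: the right-hand side of ilike is treated as a pattern, so a stored name containing % or _ acts as a wildcard. The sketch below illustrates this in Python with an in-memory SQLite database as a stand-in for PostgreSQL (an assumption; SQLite's LIKE is case-insensitive for ASCII and plays the role of ilike here). The sample store names are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (store_name TEXT);
    CREATE TABLE b (store_name TEXT);
    INSERT INTO a VALUES ('Corner Shop'), ('100% Fresh'), ('100 Percent Fresh');
    INSERT INTO b VALUES ('CORNER SHOP'), ('100% fresh');
""")

# Pattern comparison: the '%' stored in table b behaves as a wildcard.
like_matches = [r[0] for r in conn.execute(
    "SELECT a.store_name FROM a JOIN b ON a.store_name LIKE b.store_name "
    "ORDER BY a.store_name")]

# Plain case-insensitive equality: no wildcard interpretation.
exact_ci_matches = [r[0] for r in conn.execute(
    "SELECT a.store_name FROM a JOIN b "
    "ON lower(a.store_name) = lower(b.store_name) "
    "ORDER BY a.store_name")]
```

Comparing lower(a.store_name) with lower(b.store_name) also works in PostgreSQL and avoids the wildcard surprise when the values are not meant to be patterns.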
Case 2: If you also want to catch spelling mismatches where the words still sound similar, install the postgresql-contrib package, create the fuzzystrmatch extension, and then use:
Select a.store_name
from a, b
Where a.store_name ilike b.store_name OR
soundex(a.store_name) = soundex(b.store_name)
If you are dealing with names, which may not always be in English, it may be more appropriate to use metaphone or dmetaphone function instead of soundex.
Documentation: Fuzzystrmatch
If you want exact matching, you can use a straightforward join.
Select a.store_name
from a
join b on a.store_name = b.store_name;
If you want fuzzy matching, use the various fuzzy string matching functions in the join criteria. Documentation here
Note: fuzzy string matching has some limitations, so I would advise testing each function on values that you know either match or don't match.
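As a rough way to run that kind of check outside the database, here is a minimal Python sketch of a simplified Soundex. It is an illustration only: simple_soundex and the sample pairs are hypothetical, and the result may differ from the fuzzystrmatch soundex function on edge cases, so the real test should still be run against the database.

```python
# Simplified Soundex: letter groups that share a digit sound alike.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def simple_soundex(name):
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = letters[0]
    prev = CODES.get(letters[0], "0")
    for ch in letters[1:]:
        code = CODES.get(ch, "0")  # vowels, H, W, Y map to "0"
        # Skip uncoded letters and repeats of the previous letter's code.
        if code != "0" and code != prev:
            out += code
            if len(out) == 4:
                break
        prev = code
    return out.ljust(4, "0")


# Pairs you already know should or should not match (made-up samples).
known_matches = [("Robert", "Rupert"), ("Smith", "Smyth")]
known_non_matches = [("Tesco", "Asda")]

match_results = [simple_soundex(x) == simple_soundex(y) for x, y in known_matches]
non_match_results = [simple_soundex(x) == simple_soundex(y) for x, y in known_non_matches]
```

Any pair in known_matches that comes back False, or any pair in known_non_matches that comes back True, points to a limitation of the function for your data.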
Related
I am using PostgreSQL and I would like to get row/rows(depending on the query), by giving it a value and searching all of the available columns of my table.
How would I go about checking every column for a value? I am also looking at checking for different types of values.
If the columns are a and b and the text you are searching for is 'test' then something like this will do the trick:
select *
from table1
where 'test' in (a, b);
Note: this will only work if all your columns have the same datatype, and that datatype is the same as that of the value you search for.
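A quick way to try the query above is the following Python sketch, which uses an in-memory SQLite database as a stand-in for PostgreSQL (an assumption; SQLite is more lenient about mixed datatypes than PostgreSQL). The id column and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, a TEXT, b TEXT);
    INSERT INTO table1 VALUES (1, 'test', 'x'), (2, 'y', 'test'), (3, 'y', 'z');
""")

# The search value is passed as a parameter and compared against each column.
rows = conn.execute(
    "SELECT id FROM table1 WHERE ? IN (a, b) ORDER BY id", ("test",)
).fetchall()
matching_ids = [r[0] for r in rows]
```

Rows 1 and 2 are returned because 'test' appears in column a of the first and column b of the second.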
I need to select a column whose name is built from the values of other columns. That is, I need to return a value from a column, but which column depends on values stored in another table.
For instance:
Table A
Column1 | Column2
1 | 2
Based on those values, I need to read column "VE12" in table B.
I need this to be dynamic. execute(@query) is my last resort, and I would like to avoid CASE WHEN statements because I have more than 50 options.
My query will be something like:
select case when fn.tab=8 and fo.pais=3 then cp.ve83 end
FROM fn
INNER JOIN fo ON fo.stamp = fn.stamp
INNER JOIN cp
If the value in the column tab is 8 and the value in column pais is 3 I should return the value in column ve83.
Thanks for all the help!
The only sensible option is to go back to the business meaning of the data and redesign the database according to that, instead of according to "technique-oriented abstractions" such as these that SQL was never intended to support.
The main reason for this is that SQL was founded on FIRST order logic, and this precludes supporting stuff like varying domains. Which you are doing (or at least seeking to do) because ve12 could be a DATETIME and ve83 could be a VARCHAR and ve56 could be a BLOB etc. etc. So there is just no way for you [or anyone else] to determine the data type of the results in your query, and it is even more impossible to attach meaning to what comes out of your desired query precisely because of this varying-domain and varying-source characteristic.
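To make the redesign suggestion concrete, here is one possible shape it could take, sketched in Python with an in-memory SQLite database. Everything in it is an assumption for illustration: the cp_value table, its columns, the sample rows, and above all the premise that every ve column holds the same kind of value. If the domains really do vary, a single value column like this would not be appropriate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fn (stamp TEXT, tab INTEGER);
    CREATE TABLE fo (stamp TEXT, pais INTEGER);
    -- Hypothetical replacement for the wide cp table:
    -- one row per (tab, pais) pair instead of one column per pair.
    CREATE TABLE cp_value (
        tab INTEGER,
        pais INTEGER,
        value TEXT,
        PRIMARY KEY (tab, pais)
    );
    INSERT INTO fn VALUES ('s1', 8), ('s2', 1);
    INSERT INTO fo VALUES ('s1', 3), ('s2', 2);
    INSERT INTO cp_value VALUES
        (8, 3, 'value formerly in ve83'),
        (1, 2, 'value formerly in ve12');
""")

# The column choice becomes an ordinary join condition; no CASE, no dynamic SQL.
rows = conn.execute("""
    SELECT fn.stamp, cp.value
    FROM fn
    JOIN fo ON fo.stamp = fn.stamp
    JOIN cp_value AS cp ON cp.tab = fn.tab AND cp.pais = fo.pais
    ORDER BY fn.stamp
""").fetchall()
```

With data stored as rows, adding a 51st option means inserting a row rather than adding a column and another branch to the query.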
I'm trying to join two fact tables in a Netezza DB on a common acct_nbr field. In table a, it's BIGINT, and in table b it's coded as VARCHAR. (I have no control over the table design, and I suspect it's set up as VARCHAR because it's populated by web input, and needs to be able to tolerate typos.) I'd like to disregard the alpha characters for the join - I'm willing to rule out all fields in table b that contain non-numeric characters. (The field also contains -,?,!, etc.)
I've tried the following:
A basic join. Throws Bad int8 representation for '9999R99999', I assume based on the first non-convertible VARCHAR entry it comes across.
Using cast/convert on both fields (to BIGINT for b.acct_nbr, to VARCHAR for a.acct_nbr) which I may have implemented incorrectly. Various errors, no results.
Using "select ... from table_a a join table_b b on (a.acct_nbr=b.acct_nbr and b.acct_nbr not like '%[^0-9]%')". I don't seem to be able to make this work, and I haven't found a good explanation for how the '%[]%' syntax works. I know what % does, but I've a poor understanding of how to use the caret and the brackets.
I'm sure this is a simple problem, but I'm banging my head against the wall. Any help is much appreciated!
You can complete the join a few different ways.
select ...
from table_a a join
table_b b on (a.acct_nbr=b.acct_nbr
and translate(b.acct_nbr,'1234567890','') in ('','.','-','-.'))
Or if you have the sql functions toolkit installed you could do this.
select ...
from table_a a join
table_b b on
(a.acct_nbr=sql_functions..regexp_extract(b.acct_nbr,'^[0-9]{1,18}'))
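The idea behind both queries, keeping only the rows of table b whose acct_nbr consists of digits and then comparing, can be tried out with the Python sketch below. It uses an in-memory SQLite database rather than Netezza (an assumption), and SQLite's GLOB operator, where [^0-9] means "any one character that is not a digit". GLOB is SQLite-specific; the sample account numbers are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Netezza (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (acct_nbr INTEGER);
    CREATE TABLE table_b (acct_nbr TEXT);
    INSERT INTO table_a VALUES (1234567890), (9999);
    INSERT INTO table_b VALUES ('1234567890'), ('9999R99999'), ('12-34'), ('9999?');
""")

# Keep only all-digit values from table_b, then compare as integers.
# NOT GLOB '*[^0-9]*' rejects any string containing a non-digit character.
rows = conn.execute("""
    SELECT a.acct_nbr
    FROM table_a a
    JOIN table_b b
      ON b.acct_nbr NOT GLOB '*[^0-9]*'
     AND a.acct_nbr = CAST(b.acct_nbr AS INTEGER)
    ORDER BY a.acct_nbr
""").fetchall()
joined_accounts = [r[0] for r in rows]
```

Only 1234567890 survives the join; the entries with letters or punctuation are filtered out before the comparison.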
I have a table with 3 columns. The first is Id, the second is Name, and the third is Description. How can I select the value in the Description field by giving the column index, 3?
Thanks in advance
You can't, from plain SQL (other than in the ORDER BY clause, which won't give you the value but will allow you to sort the result set by it).
If you are using another programming language to construct a dynamic query, you could use that to identify the column being selected by its index number.
Alternatively, you could parameterise your query to return a specific column based on a case statement - like so:
select a, b, c, d, e, ...,
case ?
when 1 then a
when 2 then b
when 3 then c
when 4 then d
when 5 then e
...
end as parameterised_column
from ...
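A runnable version of this parameterised CASE approach might look like the Python sketch below, which uses an in-memory SQLite database. The items table, its sample row, and the column_by_index helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (Id INTEGER, Name TEXT, Description TEXT);
    INSERT INTO items VALUES (1, 'widget', 'a small widget');
""")


def column_by_index(index):
    """Return the value of the column at a 1-based index for the row with Id = 1."""
    row = conn.execute("""
        SELECT CASE ?
                 WHEN 1 THEN Id
                 WHEN 2 THEN Name
                 WHEN 3 THEN Description
               END AS parameterised_column
        FROM items
        WHERE Id = 1
    """, (index,)).fetchone()
    return row[0]


description = column_by_index(3)
out_of_range = column_by_index(4)  # no matching WHEN branch, so the CASE yields NULL
```

The mapping from index to column lives in the query text, so it stays correct even if someone later adds a column to the table.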
The problem with referring to a column by an index number is that, one day, someone may add a column and break your application as the wrong value will be returned.
This principle is enforced in SQL because you can select named columns, or all columns using the * syntax.
This principle is not enforced in programming languages, where you can usually access the column by ordinal in code, but you should consider the principle before deciding to use a statement such as (pseudocode)
value = results[0].column[2].value;
It should be possible. You'd have to query the system tables (which do vary from one version of SQL to another) to get the 3rd (or Nth) column name as a string to form a following query using that column name.
In SQL 2000 the tables you'll need to start with are syscolumns with a join to sysobjects for the table name. Then the rank() function on "Colid" will give you the Nth column and "name" (shockingly) the name of the column. Once you've got that in a variable the following command can return the value, compare to it, order by it or whatever you need.
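The same two-step idea, look up the Nth column name in the catalog and then build a query with it, can be sketched in Python against SQLite, whose catalog is exposed through PRAGMA table_info instead of syscolumns and sysobjects. This is an illustration under that substitution, not SQL Server code; column_name_at and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_name (Id INTEGER, Name TEXT, Description TEXT);
    INSERT INTO tbl_name VALUES (3, 'widget', 'a small widget');
""")


def column_name_at(connection, table, position):
    """Return the name of the column at a 1-based position in the table."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk),
    # listed in column order.
    columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
    return columns[position - 1][1]


# Step 1: resolve the third column's name from the catalog.
third_column = column_name_at(conn, "tbl_name", 3)

# Step 2: build the follow-up query with that name (quoted as an identifier).
value = conn.execute(
    f'SELECT "{third_column}" FROM tbl_name WHERE Id = ?', (3,)
).fetchone()[0]
```

Because the column name is spliced into the query text, it should only ever come from the catalog, never from user input.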
This is how you can retrieve a column's name by passing its index.
Here the variable AcID is used as the index of the column.
Example code:
dim gFld as string
vSqlText1 = "Select * from RecMast where ID = 1000"
vSql1 = New SqlClient.SqlCommand(vSqlText1, cnnRice)
vRs1 = vSql1.ExecuteReader
if vRs1.Read then
gFld = vRs1.GetName(AcID)
msgbox gfld
end if
declare @searchIndex int
set @searchIndex = 3
select Description from tbl_name t where t.Id = @searchIndex
I'm trying to create a sql query, but there is this error:
Ambiguous column name 'description'.
It's because this column occurs in both tables.
If I remove the description from the query, it works.
I tried to rename the description-field "AS description_pointer", but the error still occurs.
SELECT TOP 1000 [activityid]
,[activitytypecodename]
,[subject]
,[regardingobjectid]
,[contactid]
,[new_crmid]
,[description] AS description_pointer
FROM [crmtestext_MSCRM].[dbo].[FilteredActivityPointer] as I
Left JOIN [crmtestext_MSCRM].[dbo].[FilteredContact]
ON I.[regardingobjectid] = [crmtestext_MSCRM].[dbo].[FilteredContact].[contactid]
WHERE new_crmid not like '%Null%' AND activitytypecodename like '%E-mail%'
Both tables in the query have a column named description. Your RDBMS cannot guess which table's column you actually want.
You need to prefix the column name with the table name (or table alias) to disambiguate it.
Bottom line, it is good practice to always prefix column names with table names or aliases as soon as several tables come into play in a query. This avoids the issue that you are seeing here and makes the queries easier to understand for the poor souls who have no knowledge of the underlying schema.
Here is an updated version of your query with table aliases and column prefixes. Obviously you need to review each column to put the correct alias:
SELECT TOP 1000
i.[activityid]
,i.[activitytypecodename]
,i.[subject]
,c.[regardingobjectid]
,c.[contactid]
,c.[new_crmid]
,c.[description] AS description_pointer
FROM [crmtestext_MSCRM].[dbo].[FilteredActivityPointer] as i
Left JOIN [crmtestext_MSCRM].[dbo].[FilteredContact] as c
ON i.[regardingobjectid] = c.[contactid]
WHERE i.new_crmid not like '%Null%' AND i.activitytypecodename like '%E-mail%'
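The ambiguity and its fix can be reproduced in miniature with the Python sketch below. It uses an in-memory SQLite database instead of SQL Server (an assumption, so the error text differs slightly), and the activity and contact tables are simplified, made-up stand-ins for the two CRM views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (activityid INTEGER, regardingobjectid INTEGER, description TEXT);
    CREATE TABLE contact (contactid INTEGER, description TEXT);
    INSERT INTO activity VALUES (1, 10, 'activity text');
    INSERT INTO contact VALUES (10, 'contact text');
""")

join_clause = """
    FROM activity AS i
    LEFT JOIN contact AS c ON i.regardingobjectid = c.contactid
"""

# Unqualified: both tables have a description column, so the query is rejected.
try:
    conn.execute("SELECT description AS description_pointer" + join_clause)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualified with table aliases: each column is unambiguous.
rows = conn.execute(
    "SELECT i.description AS description_pointer, c.description" + join_clause
).fetchall()
```

Note that the AS description_pointer alias alone does not help, because the alias names the output column while the ambiguity is in the source column.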