How do you JOIN tables to a view using a Vertica DB?


Good morning/afternoon! I was hoping someone could help me out with something that probably should be very simple.
Admittedly, I’m not the strongest SQL query designer. That said, I’ve spent a couple of hours beating my head against my keyboard trying to get a seemingly simple three-way join working.
NOTE: I'm querying a Vertica DB.
Here is my query:
SELECT A.CaseOriginalProductNumber, A.CaseCreatedDate, A.CaseNumber, B.BU2_Key as BusinessUnit, C.product_number_desc as ModelNumber
FROM pps_sfdc.v_Case A
INNER JOIN reference_data.DIM_PRODUCT_LINE_HIERARCHY B
ON B.PL_Key = A.CaseOriginalProductLine
INNER JOIN reference_data.DIM_PRODUCT C
ON C.product_line_code = A.CaseOriginalProductLine
WHERE B.BU2_Key = 'XWT'
LIMIT 20
I have a view (v_Case) that I’m trying to join to two other tables so I can look up a value from each of them. The above query returns identical data in every column EXCEPT the last one (see below). It's as if it's iterating through the last column to pull out the unique entries, sort of like a "GROUP BY" clause. What SHOULD be happening is that I get unique rows with the specific "BusinessUnit" and "ModelNumber" for each record.
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 1
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 2
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 3
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 4
I modeled my solution after this post:
How to deal with multiple lookup tables for beginners of SQL?
What am I doing wrong?
Thank you for any help you can provide.

This is a data issue. A general rule when troubleshooting these: the column whose value differs from row to row (in this case C.product_number_desc as ModelNumber) usually points to the table causing the problem, which is why I pointed you towards dim_product.
If you receive duplicates, the query below will help identify whether a given table is causing them. Remember that key in this statement can be multiple fields: whatever you are joining the table on.
Select key,count(1) from table group by key having count(1)>1
For the future: don't assume it's your code. Duplicates like this almost always point to dirty data (the other possibility is that your join keys are wrong or incomplete, so you are effectively producing a cross join). If you had commented out the 'c' table and the column that refers to it in the SELECT clause, you would have received one row, so your dupes were coming from the 'c' table here.
Good luck with it
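To make the duplicate check above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 rather than Vertica, and the table contents (the 'PL1'/'PL2' codes, a cut-down v_case and dim_product) are made up for illustration. It only shows that a join key repeated on the lookup side fans one case row out into several.

```python
import sqlite3

# Hypothetical miniature of the tables in the question; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE v_case (CaseNumber TEXT, CaseOriginalProductLine TEXT);
    CREATE TABLE dim_product (product_line_code TEXT, product_number_desc TEXT);
    INSERT INTO v_case VALUES ('3002845327', 'PL1');
    INSERT INTO dim_product VALUES
        ('PL1', 'Product 1'), ('PL1', 'Product 2'),
        ('PL1', 'Product 3'), ('PL1', 'Product 4'),
        ('PL2', 'Product 5');
""")

# The diagnostic from the answer: join keys that occur more than once.
duplicate_keys = conn.execute("""
    SELECT product_line_code, COUNT(1)
    FROM dim_product
    GROUP BY product_line_code
    HAVING COUNT(1) > 1
""").fetchall()

# The single case row is repeated once per matching dim_product row.
joined = conn.execute("""
    SELECT a.CaseNumber, c.product_number_desc
    FROM v_case a
    INNER JOIN dim_product c
        ON c.product_line_code = a.CaseOriginalProductLine
    ORDER BY c.product_number_desc
""").fetchall()

print(duplicate_keys)
print(joined)
```

If the diagnostic returns any rows for the table you are joining to, the join key alone does not identify a single row there, and you need either an additional join column or a different lookup table.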

Related

Return query results where two fields are different (Access 2010)

I'm working in a large access database (Access 2010) and am trying to return records where two locations are different.
In my case, I have a large number of birds that have been observed on multiple dates and potentially on different islands. Each bird has a unique BirdID and also an actual physical identifier (unfortunately that may have changed over time). [I'm going to try addressing the changing physical identifier issue later]. I currently want to query individual birds where one or more of their observations is different than the "IslandAlpha" (the first island where they were observed). Something along the lines of a criteria for BirdID: WHERE IslandID [doesn't equal] IslandAlpha.
I then need a separate query to find where all observations DO equal where they were first observed. So where IslandID = IslandAlpha
I'm new to Access, so let me know if you need more information on how my tables/relationships are set up! Thanks in advance.

Assuming the following tables:
Birds table in which all individual birds have records with a unique BirdID and IslandAlpha.
Sightings table in which individual sightings are recorded, including IslandID.
Your first query would look something like this:
SELECT *
FROM Birds
INNER JOIN Sightings ON Birds.BirdID=Sightings.BirdID
WHERE Sightings.IslandID <> Birds.IslandAlpha
Your second query would be the same but with = instead of <> in the WHERE clause.
Please provide us information about the tables and columns you are using.

I will presume you are asking this question because a simple join of tables filtered on IslandAlpha <> ObsLoc is not possible: IslandAlpha has to be derived from the first observation record for each bird. Pulling the first observation record for each bird requires a nested query, and Observations needs a unique record identifier (an autonumber field should serve). Assuming there is an observation date/time field, consider:
SELECT * FROM Observations WHERE ObsID IN
(SELECT TOP 1 ObsID FROM Observations AS Dupe
WHERE Dupe.ObsBirdID = Observations.ObsBirdID ORDER BY Dupe.ObsDateTime);
Now save that query (as Query1 here) and use it in subsequent queries:
SELECT * FROM Observations
INNER JOIN Query1 ON Observations.ObsBirdID = Query1.ObsBirdID
WHERE Observations.ObsLocID <> Query1.ObsLocID;
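Here is a rough, runnable version of the same two-step idea, sketched with Python's sqlite3 instead of Access. The sample rows are invented, the saved Query1 is replaced by a common table expression named FirstObs (my name, not from the answer), and SQLite's LIMIT 1 stands in for Access's TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Observations (
        ObsID INTEGER PRIMARY KEY, ObsBirdID INTEGER,
        ObsLocID TEXT, ObsDateTime TEXT);
    -- Invented data: bird 10 moves islands, bird 20 stays put.
    INSERT INTO Observations VALUES
        (1, 10, 'Island A', '2014-01-01'),
        (2, 10, 'Island B', '2014-02-01'),
        (3, 20, 'Island C', '2014-01-15'),
        (4, 20, 'Island C', '2014-03-01');
""")

# FirstObs plays the role of Query1: the earliest observation per bird.
moved = conn.execute("""
    WITH FirstObs AS (
        SELECT o.ObsBirdID, o.ObsLocID
        FROM Observations o
        WHERE o.ObsID = (
            SELECT d.ObsID
            FROM Observations d
            WHERE d.ObsBirdID = o.ObsBirdID
            ORDER BY d.ObsDateTime, d.ObsID
            LIMIT 1)
    )
    SELECT o.ObsBirdID, o.ObsLocID
    FROM Observations o
    INNER JOIN FirstObs f ON o.ObsBirdID = f.ObsBirdID
    WHERE o.ObsLocID <> f.ObsLocID
    ORDER BY o.ObsID
""").fetchall()

print(moved)
```

Note that this lists the individual observations made away from the first island. Finding birds whose observations ALL match the first island needs a different query (for example, the birds that never appear in this result), not just = in place of <>.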

sql combine 2 rows into one

I have the following dataset:
I am trying to convert the table on the left into the table on the right. I have several duplicates of orders with the same name but different products sold. I would like to combine the rows so it shows just one orderID. I've tried joining the table to itself based on order, but I must be doing something wrong. Do you guys have any suggestions? This is probably super easy, but I am not proficient with SQL yet. Thank you in advance.

If there is at most one value in each column, you can use group by:
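-- order is a reserved word in most databases, so it has to be quoted;
-- "order" is the standard form (SQL Server/Access use [order], MySQL uses `order`)
select "order", name, max(product1) as product1, max(product2) as product2,
max(product3) as product3
from lefttable
group by "order", name;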
That said, I suspect that the table on the left is the result of a query on the data. You probably simply need the right aggregation for that query.
Also, if you have more than one value in any column for an order, you can still do this, but the query is a bit more complicated.
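As a small illustration of the GROUP BY approach, here is a sketch using Python's sqlite3. The original dataset was an image that isn't available here, so the rows below are invented; the table and column names follow the answer. It relies on MAX ignoring NULLs, so each group keeps the one non-NULL value per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "order" is quoted because it is a reserved word.
    CREATE TABLE lefttable (
        "order" INTEGER, name TEXT,
        product1 TEXT, product2 TEXT, product3 TEXT);
    -- Invented rows: each order is spread over several sparse rows.
    INSERT INTO lefttable VALUES
        (1, 'Ann', 'apple', NULL, NULL),
        (1, 'Ann', NULL, 'pear', NULL),
        (2, 'Bob', NULL, NULL, 'plum');
""")

combined = conn.execute("""
    SELECT "order", name,
           MAX(product1) AS product1,
           MAX(product2) AS product2,
           MAX(product3) AS product3
    FROM lefttable
    GROUP BY "order", name
    ORDER BY "order"
""").fetchall()

print(combined)
```

If an order could have two different non-NULL values in the same column, MAX would silently keep only one of them, which is the more complicated case the answer mentions.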

Linking Three Tables together

I'm creating an archive for Academic Papers. Each paper may have one author, or multiple authors. I've created the tables in the following manner:
Table 1: PaperInfo - Each row contains information on the paper
Table 2: PaperAuthor - Only Two Columns: contains PaperID, and AuthorID
Table 3: AuthorList - Contains Author Information.
There is also a Table 4 which is linked to Table 3, which contains a list of Universities which the author belongs to, but I'm going to leave it out for now in case it gets too complicated.
I wish to have a Query which will link all three tables together, and display Paper Information of the recordset in a table, with columns such as these:
Paper Title
Paper Authors
The column "Paper Authors" is going to contain more than one author in some cases.
I've written the following query:
SELECT a.*,b.*,c.*
FROM PaperInfo a, PaperAuthor b, AuthorList c
WHERE a.PaperID = b.PaperID AND b.AuthorID = c.AuthorID
So far, the results I've been getting are one author per row. I wish to have multiple authors in one column. Can this be done in any way?
Note: I'm using Access 2010 as my database.

In straight SQL the answer unfortunately is that it isn't possible. You would need to use a processing language in order to get the result you are after.

Since you mention you are using Access 2010 please refer to this question: is there a group_concat function in ms-access?
Particularly, read the post which points to http://www.rogersaccesslibrary.com/forum/generic-function-to-concatenate-child-records_topic16&SID=453fabc6-b3z9-34z6zb14-a78f832z-19z89a2c.html
You probably need to implement a custom function but the 2nd url does what you are looking for.

This functionality is not part of the SQL standard, but different vendors have solutions for it, see for instance Pivot Table with many to many table, MySQL pivot table.

If you know the maximum number of authors per paper (for example 3 or 4), you could get away with a triple or quadruple left join.

What you are after is an inner join.
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.
The most common type of join is: SQL INNER JOIN (simple join). An SQL INNER JOIN returns all rows from multiple tables where the join condition is met.
http://www.w3schools.com/sql/sql_join.asp
You may want to combine the inner join with a group to give you 1 paper to many authors in your results.
The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.
http://www.w3schools.com/sql/sql_groupby.asp
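To show what "inner join plus group" can look like when the database does have a string-aggregation function, here is a sketch using Python's sqlite3 and SQLite's GROUP_CONCAT. This is not Access syntax: as noted above, Access 2010 has no built-in equivalent and needs a custom function. The Title and AuthorName columns and all rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Title and AuthorName are assumed column names; rows are invented.
    CREATE TABLE PaperInfo (PaperID INTEGER, Title TEXT);
    CREATE TABLE PaperAuthor (PaperID INTEGER, AuthorID INTEGER);
    CREATE TABLE AuthorList (AuthorID INTEGER, AuthorName TEXT);
    INSERT INTO PaperInfo VALUES (1, 'Paper One'), (2, 'Paper Two');
    INSERT INTO AuthorList VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edsger');
    INSERT INTO PaperAuthor VALUES (1, 1), (1, 2), (2, 3);
""")

# One row per paper, with its authors collapsed into a single column.
rows = conn.execute("""
    SELECT a.Title, GROUP_CONCAT(c.AuthorName, '; ') AS PaperAuthors
    FROM PaperInfo a
    INNER JOIN PaperAuthor b ON a.PaperID = b.PaperID
    INNER JOIN AuthorList c ON b.AuthorID = c.AuthorID
    GROUP BY a.PaperID, a.Title
    ORDER BY a.PaperID
""").fetchall()

for title, authors in rows:
    print(title, "->", authors)
```

SQLite does not guarantee the order of names inside the concatenated string, so don't rely on it for display order without further work.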

SQL Server query returns too many records

I have 3 tables in my SQL Server database.
They are linked together as shown in this picture (the lines connect the correct rows in the picture)
I have a query which should return all the reparations from tblreparations with some information about what was repaired. Instead, it returns each reparation 3 times, once for each laptop that the client (klant in Dutch) has assigned to it, even though each row of the reparations table (reparaties in Dutch) contains only one laptopID.
This is the query:
SELECT AankopenReparaties.Id,
AankopenReparaties.KlantenId,
AankopenReparaties.actietype,
AankopenReparaties.voorwerptype,
laptopscomputers.merk,
laptopscomputers.model,
laptopscomputers.info,
AankopenReparaties.info,
AankopenReparaties.Prijs,
AankopenReparaties.lopend
FROM AankopenReparaties, laptopscomputers
WHERE (aankopenreparaties.lopend = 'lopend');
It returns this
and it should be only one row since the reparations table (aankopenreparaties) only contains one row with one laptopID
Does anyone know how to fix this?
Please help because it should be fixed soon (it's an assignment for school)

The reason you are getting too many records is that your query produces the Cartesian product of both tables. You need to tell the server how the two tables are related to each other.
SELECT AankopenReparaties.Id,
AankopenReparaties.KlantenId,
AankopenReparaties.actietype,
AankopenReparaties.voorwerptype,
laptopscomputers.merk,
laptopscomputers.model,
laptopscomputers.info,
AankopenReparaties.info,
AankopenReparaties.Prijs,
AankopenReparaties.lopend
FROM AankopenReparaties
INNER JOIN laptopscomputers
ON AankopenReparaties.LaptopID = laptopscomputers.ID -- specify relationship
WHERE aankopenreparaties.lopend = 'lopend'
To learn more about joins, see the links below:
Visual Representation of SQL Joins
INNER JOIN ON vs WHERE clause
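The difference between the two queries can be seen on a toy dataset. This is a sketch using Python's sqlite3 rather than SQL Server, with cut-down versions of the two tables and invented rows (one running repair, three laptops).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, made-up versions of the tables from the question.
    CREATE TABLE AankopenReparaties (Id INTEGER, LaptopID INTEGER, lopend TEXT);
    CREATE TABLE laptopscomputers (ID INTEGER, merk TEXT);
    INSERT INTO AankopenReparaties VALUES (1, 2, 'lopend');
    INSERT INTO laptopscomputers VALUES (1, 'Dell'), (2, 'HP'), (3, 'Acer');
""")

# No join condition: every repair is paired with every laptop.
cross_rows = conn.execute("""
    SELECT AankopenReparaties.Id, laptopscomputers.merk
    FROM AankopenReparaties, laptopscomputers
    WHERE AankopenReparaties.lopend = 'lopend'
""").fetchall()

# With the relationship spelled out, only the matching laptop remains.
joined_rows = conn.execute("""
    SELECT AankopenReparaties.Id, laptopscomputers.merk
    FROM AankopenReparaties
    INNER JOIN laptopscomputers
        ON AankopenReparaties.LaptopID = laptopscomputers.ID
    WHERE AankopenReparaties.lopend = 'lopend'
""").fetchall()

print(len(cross_rows), len(joined_rows))
```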

Using VBA to get the sum of values based on criteria from other tables?

I need to find the sum of the prices of a number of products, however the prices are stored in a different table to products that need pricing.
But, there is a catch, it needs to select these items based on criteria from a third table too.
So, I need the sum of the price of all products in Table 1 where CutID in Table 2 = 001.
Table 1 and Table 2 are linked on SCID, one to many respectively.
If this makes no sense tell me and I will try to clarify?
Thanks,
Bob P

Based on your question, I don't think there's a need for VBA. Excel formulas should be sufficient.
Add a few columns to your primary table. In these columns, use vlookup() to get all your information in one place, including the criteria.
If you only need to sum based on one criterion, use sumif(). If there are multiple criteria, use sumproduct().

Generally, with Access, I initially try to work with something as close as possible to a standard SQL query for ease of maintenance and portability. This ran for me in Access 2010:
SELECT Products.ProductID, Sum(Prices.Price) AS PriceSum
FROM Prices INNER JOIN (Critera INNER JOIN Products ON Critera.SCID = Products.SCID) ON Prices.ProductID = Products.ProductID
WHERE Critera.CutID="001"
GROUP BY Products.ProductID;
Please let us know if that works with your data (I'm not sure of your column names, either).
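For anyone who wants to try the shape of that query without Access, here is a sketch using Python's sqlite3. The table and column names follow the answer above (including its spelling of Critera), the rows are invented, the nested Access join is written as flat joins, and the string literal uses single quotes as SQLite expects. The second query, a single grand total, is my addition based on how the question is worded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample data; names follow the Access answer.
    CREATE TABLE Products (ProductID INTEGER, SCID INTEGER);
    CREATE TABLE Prices (ProductID INTEGER, Price REAL);
    CREATE TABLE Critera (SCID INTEGER, CutID TEXT);
    INSERT INTO Products VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO Prices VALUES (1, 5.0), (2, 7.5), (3, 9.0);
    INSERT INTO Critera VALUES (100, '001'), (200, '002');
""")

# Per-product sums, as in the answer's query.
per_product = conn.execute("""
    SELECT Products.ProductID, SUM(Prices.Price) AS PriceSum
    FROM Products
    INNER JOIN Critera ON Critera.SCID = Products.SCID
    INNER JOIN Prices ON Prices.ProductID = Products.ProductID
    WHERE Critera.CutID = '001'
    GROUP BY Products.ProductID
    ORDER BY Products.ProductID
""").fetchall()

# Dropping GROUP BY gives one total across all matching products.
grand_total = conn.execute("""
    SELECT SUM(Prices.Price)
    FROM Products
    INNER JOIN Critera ON Critera.SCID = Products.SCID
    INNER JOIN Prices ON Prices.ProductID = Products.ProductID
    WHERE Critera.CutID = '001'
""").fetchone()[0]

print(per_product, grand_total)
```

If the criteria table can hold several rows with the same SCID and CutID, each product's price would be counted once per matching row, so it is worth checking that key for duplicates first.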
