Filter Rows - Pentaho - pentaho

We are getting input from two different tables and passing it to the Filter rows step.
But we are getting the below error.
The DATE_ADDED table has only one column, DATE_ADDED, and similarly the TODAYS_DATE table has a single column, TODAYS_DATE.
The condition given in the filter is DATE_ADDED < TODAYS_DATE.
The transformation is
Can someone tell me where I am making the mistake?

It won't work like this. You expect a join of two streams (like a SQL JOIN of two tables), but what you actually get is a union (like a SQL UNION).
When two streams meet at a step they must have identical columns (names, order and types), and the result is the union of both streams with the same structure as the inputs.
When you combine streams with different structures (different column names in your case), you get unpredictable column names and in fact only one column, so there is nothing to compare with.
To do what you need, use the Merge Join step (do not forget to sort both streams on the join key).
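The difference between the two behaviours can be sketched outside Pentaho. This is only an illustration using Python's built-in sqlite3 module, not Pentaho code. The sample dates are made up, and since the tables in the question have no shared key, a cross join stands in for the join step here (that substitution is my assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_added (DATE_ADDED TEXT);
CREATE TABLE todays_date (TODAYS_DATE TEXT);
INSERT INTO date_added VALUES ('2022-02-10'), ('2022-02-20');  -- made-up sample data
INSERT INTO todays_date VALUES ('2022-02-14');
""")

# Union: rows of both inputs are stacked into ONE column,
# so there is no second column left to compare against.
union_cur = conn.execute(
    "SELECT DATE_ADDED FROM date_added UNION ALL SELECT TODAYS_DATE FROM todays_date"
)
union_columns = [d[0] for d in union_cur.description]
union_rows = union_cur.fetchall()

# Join: each row carries BOTH columns, so the filter condition can be evaluated.
join_cur = conn.execute(
    "SELECT DATE_ADDED, TODAYS_DATE FROM date_added CROSS JOIN todays_date "
    "WHERE DATE_ADDED < TODAYS_DATE"
)
join_columns = [d[0] for d in join_cur.description]
join_rows = join_cur.fetchall()

print(union_columns, union_rows)
print(join_columns, join_rows)
```

The union result has a single column holding all three values, while the joined result has both columns side by side, which is what the Filter rows condition needs.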

Both the column names and types should be identical if you want to merge the streams in a single step. Right-click on both steps and click output fields to verify the data types.
If data type issues arise or you want to rename the columns, you can place a Select values step after each table step, set the Date type (in your case) in the Meta-data tab, and rename the fields there as well.
Hope this helps... :)

Related

SAS intersect with date values

I have two tables, one that was created using a connection to Teradata and another that was created by importing an Excel file.
I need to find the records that are in one table but not in the other, given three of the fields. The first two return what I would expect, but when I add the third field, which is a date field, none of the records match. When I look into the tables I see that the two dates are identical, but somehow SAS does not consider them to be identical.
My code looks like this:
PROC SQL;
CREATE TABLE fields_that_do_not_match AS
SELECT /*FIRST TWO FIELDS*/ date_a FROM table_a
EXCEPT
SELECT /*FIRST TWO FIELDS*/ date_b FROM table_b;
QUIT;
Is there something else I should be considering when comparing dates?
When I look at the properties of the date fields, both have the DATE9. format, are numeric and are 8 bytes long. Both dates show 14FEB2022 when I query the tables, but I don't know if some of the tables hold additional information that is not being displayed because of the format.
Thanks in advance.
The dates did not match because one of them had decimal values, so I needed to drop the fractional part. I added INT(DATE_A) and it worked after that.
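Here is a rough sketch of why this happens, written in Python with sqlite3 rather than SAS. SAS stores a date as the number of days since 1 January 1960, and a date format such as DATE9. shows only the whole-day part, so a hidden fraction is invisible but still breaks an exact comparison. The 0.5 fraction is an invented value for the demonstration, and CAST(... AS INTEGER) stands in for the SAS INT function.

```python
import sqlite3
from datetime import date

# Day count since 1 January 1960, the way SAS represents 14FEB2022.
sas_day = (date(2022, 2, 14) - date(1960, 1, 1)).days

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (date_a REAL)")
conn.execute("CREATE TABLE table_b (date_b REAL)")
conn.execute("INSERT INTO table_a VALUES (?)", (sas_day + 0.5,))  # hidden fraction (made up)
conn.execute("INSERT INTO table_b VALUES (?)", (float(sas_day),))

# Exact comparison: the two values differ, so EXCEPT reports a mismatch.
raw = conn.execute(
    "SELECT date_a FROM table_a EXCEPT SELECT date_b FROM table_b"
).fetchall()

# Truncating both sides to whole days makes the dates compare equal.
truncated = conn.execute(
    "SELECT CAST(date_a AS INTEGER) FROM table_a "
    "EXCEPT SELECT CAST(date_b AS INTEGER) FROM table_b"
).fetchall()

print(raw)        # one unmatched row
print(truncated)  # no unmatched rows
```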

How do I combine two tables that have only a few similar columns?

I'm attempting to combine two tables, both of which aren't related in any way except for a few columns (ID, Created Date, Country, etc.). In essence, I simply want to append one table to another. However, I would like to combine the columns that are similar and add on the columns that are not similar. I've attempted a Union, but my tables don't have the same number of columns. Currently, I'm working with this:
SELECT * FROM `leads`, `opportunity`
where `leads`.`Id` = `opportunity`.`Id`
which doesn't really work when I want to use this new query as a subquery elsewhere. Additionally, the fields in each table can change at any time, so I’m never sure which columns are matching or non-matching. I simply want to append the rows from one table onto the other while automatically combining columns with identical names. I feel like I'm missing something obvious...
NOTE: I am doing this within DOMO, so I have a few more limitations than I normally would.
You can use joins:
SELECT * FROM `leads` JOIN `opportunity`
on `leads`.`Id` = `opportunity`.`Id`
And to get only selected columns:
SELECT leads.column_name, opportunity.column_name FROM `leads` JOIN `opportunity`
on `leads`.`Id` = `opportunity`.`Id`
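A small sketch of what the join with selected columns returns, using Python's sqlite3 module as a stand-in for the real database. The columns Source and Amount and all sample rows are hypothetical. Note that an inner join keeps only the rows whose Id exists in both tables and puts the columns side by side; it does not append the rows of one table to the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (Id INTEGER, Country TEXT, Source TEXT);         -- Source is hypothetical
CREATE TABLE opportunity (Id INTEGER, Country TEXT, Amount REAL);   -- Amount is hypothetical
INSERT INTO leads VALUES (1, 'US', 'web'), (2, 'DE', 'email');
INSERT INTO opportunity VALUES (2, 'DE', 500.0), (3, 'FR', 750.0);
""")

# Only Id 2 appears in both tables, so it is the only row returned.
joined = conn.execute("""
    SELECT leads.Id, leads.Source, opportunity.Amount
    FROM leads JOIN opportunity
    ON leads.Id = opportunity.Id
""").fetchall()

print(joined)
```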

SQL Exclude a specific column from SQL query result

I have a query where I process columns from two tables, and at the end I want ALL columns from one temporary table and ONLY ONE column from the other table. Also, I do not want the KEY column to appear twice after the join.
I cannot find a clean, efficient way to do it. I found these solutions:
Specify all columns explicitly. Bad for obvious reasons if you have to type many columns.
Get all columns and then DROP the ones you don't need. Not efficient because you carry loads of data and then throw it away.
Is there a one-liner SQL command that leaves out a single column?
Is there an SQL command that removes the duplicate KEY column after joining?
Thanks!!
How about selecting all columns from one table and one from the other?
select t1.*, t2.col
from t1 join
t2
on . . .
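To see what that query shape returns, here is a minimal sketch with Python's sqlite3 module. The table layouts, the sample rows and the join condition on id are all assumptions, since the answer leaves the ON clause open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, a TEXT, b TEXT);
CREATE TABLE t2 (id INTEGER, col TEXT, unwanted TEXT);
INSERT INTO t1 VALUES (1, 'a1', 'b1');
INSERT INTO t2 VALUES (1, 'c1', 'x1');
""")

# t1.* expands to every column of t1; only col is taken from t2,
# so the key column id appears once and unwanted is never fetched.
cur = conn.execute(
    "SELECT t1.*, t2.col FROM t1 JOIN t2 ON t1.id = t2.id"  # join key is assumed
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)
print(rows)
```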

SQL Select statement from multiple tables to fill datagridview

I have multiple tables, named after the date they were created on, for example 4/01/2021, 5/01/2021, etc.
The tables all contain the same columns.
I'd like to create a SQL statement that returns all the tables that were created between two dates and fill a DataGridView with all the records in those tables.
Ideally I want "Created Last Week", "Created This Week" and "Created This Month" options. I can work out the syntax for the start and end dates, but I'm not sure what the correct way is to return the tables that fall between the dates.
I have looked at a few examples, but none seem to work for me or be exactly what I'm after. Not sure if I can use sys.tables or if I need to use inner joins/left joins etc. to get this to work.
My tables are in an Access .MDB file.
You will need a union query:
Use a union query to combine multiple queries into a single result
However, as the tables included will vary, you must create the SQL of the query dynamically and then call the query to fill your datagridview.
Note: This is a terrible setup. As suggested by #June7, you should change your schema as soon as possible to have only one table, with a field holding the dates (your current table names).
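Building the union query dynamically could look roughly like the sketch below. It is written in Python against an in-memory SQLite database purely for illustration; with Access you would list the table names through your own data provider instead. The helper build_union_sql is hypothetical, and reading the table names as day/month/year is an assumption.

```python
import sqlite3
from datetime import date, datetime

def build_union_sql(table_names, start, end):
    """Return a UNION ALL query over the tables whose name is a date in [start, end].

    Returns an empty string when no table falls in the range; the caller must check.
    """
    selected = []
    for name in table_names:
        try:
            # Assumption: names are day/month/year, e.g. 4/01/2021 is 4 January 2021.
            created = datetime.strptime(name, "%d/%m/%Y").date()
        except ValueError:
            continue  # not a date-named table, skip it
        if start <= created <= end:
            selected.append(name)
    # Square brackets quote names that contain slashes.
    return " UNION ALL ".join(f"SELECT * FROM [{name}]" for name in selected)

# Stand-in database with three date-named tables of identical structure.
conn = sqlite3.connect(":memory:")
for name, item in [("4/01/2021", "a"), ("5/01/2021", "b"), ("20/01/2021", "c")]:
    conn.execute(f"CREATE TABLE [{name}] (item TEXT)")
    conn.execute(f"INSERT INTO [{name}] VALUES (?)", (item,))

table_names = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
sql = build_union_sql(table_names, date(2021, 1, 1), date(2021, 1, 10))
rows = conn.execute(sql).fetchall()

print(sql)
print(rows)
```

The resulting rows are what you would then load into the grid. With a single table and a date column, as the note above recommends, the whole helper collapses into one WHERE clause.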

SQL Server join and wildcards

I want to get the results of a left join between two tables, with both having a column of the same name, the column on which I join. The following query is seen as valid by the import/export wizard in SQL Server, but it always gives an error. I have some more conditions, so the size wouldn't be too much. We're using SQL Server 2000 iirc and since we're using an externally developed program to interact with the database (except for some information we can't retrieve that way), we can not simply change the column name.
SELECT table1.*, table2.*
FROM table1
LEFT JOIN table2 ON table1.samename = table2.samename
At least, I think the column name is the problem, or am I doing something else wrong?
Do more columns than just your join key have the same name? If only your join key has the same name then simply select one of them since the values will be equivalent except for the non-matching rows (which will be NULL). You will have to enumerate all your other columns from one of the tables though.
SELECT table2.samename,table1.othercolumns,table2.*
FROM table1
LEFT JOIN table2 ON table1.samename = table2.samename
You may need to explicitly list the columns from one of the tables (the one with fewer fields), and leave out the second instance of what would be the duplicate field.
select Table1.*, /* skip the field Table2.sameName */ Table2.fld2, Table2.Fld3, Table2.Fld4... from
Since it's a common column, it APPEARS it's trying to create it twice in the result set, thus choking your process.
Since you should never use select *, simply replace it with the names of the columns you want. The join column has the same value (or null) on both sides of the join, so only select one of them: the one from table1, which will always have the value.
If you want to select all the columns from both tables just use Select * instead of including the tables separately. That will however leave you with duplicate column names in the result set, so even reading them out by name will not work and reading them by index will give inconsistent results, as changing the columns in the database will change the resultset, breaking any code depending on the ordinals of the columns.
Unfortunately the best solution is to specify exactly the columns you need and create aliases for the duplicates so they are unique.
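The duplicate-name problem and the explicit-column fix can be reproduced with a small sketch in Python's sqlite3 module (SQLite, not SQL Server 2000, so error behaviour in the import/export wizard will differ). The columns fld1 and fld2 and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (samename INTEGER, fld1 TEXT);
CREATE TABLE table2 (samename INTEGER, fld2 TEXT);
INSERT INTO table1 VALUES (1, 'x'), (2, 'y');
INSERT INTO table2 VALUES (1, 'p');   -- no match for samename = 2
""")

# Selecting both tables with * yields the join column twice under the same name.
star = conn.execute(
    "SELECT table1.*, table2.* FROM table1 "
    "LEFT JOIN table2 ON table1.samename = table2.samename"
)
star_columns = [d[0] for d in star.description]

# Listing columns explicitly keeps one copy of the key, taken from table1,
# which always has a value in a LEFT JOIN (table2's copy is NULL when unmatched).
explicit = conn.execute(
    "SELECT table1.samename, table1.fld1, table2.fld2 FROM table1 "
    "LEFT JOIN table2 ON table1.samename = table2.samename "
    "ORDER BY table1.samename"
)
explicit_columns = [d[0] for d in explicit.description]
explicit_rows = explicit.fetchall()

print(star_columns)
print(explicit_columns, explicit_rows)
```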
I quickly get the column headings by setting the query to text mode and copying the top row ...