Adding a UUID4 ID to each row in Pentaho

I am inserting the result of a SELECT statement from a relational table into another table using Pentaho. Is it possible to add a UUID4 identifier to each row before inserting?
Data before insertion:
ip country city start_time
1.7411624393E10 Canada London 2017-06-01 15:27:23
1.7411221531E10 Canada Ottawa 2017-06-02 23:57:56
1.846525287E9 Canada Langley 2017-06-02 22:27:29
2.0647254234E10 Canada Toronto 2017-06-02 22:22:49
2.0647254234E10 Canada Toronto 2017-06-02 22:22:12
2.0647254234E10 Canada Toronto 2017-06-02 22:21:20
Needed as:
UUID ip country city start_time
ID1 1.7411624393E10 Canada London 2017-06-01 15:27:23
ID2 1.7411221531E10 Canada Ottawa 2017-06-02 23:57:56
ID3 1.846525287E9 Canada Langley 2017-06-02 22:27:29
ID4 2.0647254234E10 Canada Toronto 2017-06-02 22:22:49
ID5 2.0647254234E10 Canada Toronto 2017-06-02 22:22:12
ID6 2.0647254234E10 Canada Toronto 2017-06-02 22:21:20
I am able to generate a single UUID4 ID for all the records using the random generator, but of course I need a separate UUID for each row.

You can use the "Generate random value" step to create a column of type "Universally Unique Identifier type 4 (UUID4)".
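For readers who want to see the per-row behaviour outside Pentaho, here is a minimal Python sketch of the same idea. It is not Pentaho's API: the `add_uuid4` helper and the in-memory `rows` list are hypothetical, and only the sample values come from the question. The point is simply that the identifier is generated once per row rather than once per batch.

```python
import uuid

# Sample rows copied from the question (ip kept as text exactly as shown).
rows = [
    ("1.7411624393E10", "Canada", "London", "2017-06-01 15:27:23"),
    ("1.7411221531E10", "Canada", "Ottawa", "2017-06-02 23:57:56"),
    ("1.846525287E9", "Canada", "Langley", "2017-06-02 22:27:29"),
]


def add_uuid4(rows):
    """Hypothetical helper: prepend a freshly generated UUID4 string to every row."""
    # uuid.uuid4() is called inside the comprehension, so each row gets its own value.
    return [(str(uuid.uuid4()),) + tuple(row) for row in rows]


tagged = add_uuid4(rows)
for row in tagged:
    print(row)
```

Calling `uuid.uuid4()` outside the loop and reusing the result would reproduce the problem described in the question, where every row ends up with the same ID.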

Related

Gather the number of customer by street

I have two tables :
Customer:
id  name     address_id
--  -------  ----------
1   John     4
2   Kate     5
3   Bob      2
4   Michael  2
5   Adriana  3
6   Ann      1
Address:
id  detail_str_name                city   district     street_name
--  -----------------------------  -----  -----------  -----------
1   France,Paris,str.2,N5          Paris  Paris        str.2
2   France,Parise,str.2 ,N3        Paris  Paris        str.2
3   France, Lille ,str.3,N4        Lille  Lille        str.3
4   France,Paris,str.4,N3          Paris  Paris        str.4
5   France, Paris, Batignolles,N4  Paris  Batignolles  Batignolles
I want table like this:
name     detail_str_name                city   district     street_name  sum(cu.num_cust)
-------  -----------------------------  -----  -----------  -----------  ----------------
John     France,Paris,str.4,N3          Paris  Paris        str.4        1
Kate     France, Paris, Batignolles,N4  Paris  Batignolles  Batignolles  1
Bob      France,Parise,str.2 ,N3        Paris  Paris        str.2        3
Michael  France,Parise,str.2 ,N3        Paris  Paris        str.2        3
Adriana  France, Lille ,str.3,N4        Lille  Lille        str.3        1
Ann      France,Paris,str.2,N5          Paris  Paris        str.2        3
I want to count customers grouped by city, district and street_name, not by detail_str_name.
I tried:
select cu.name, ad.detail_str_name, ad.city, ad.district, ad.street_name, sum(cu.num_cust)
from
(select address_id, name, count(id) as num_cust
from customer
group by address_id, name) cu
left join address ad on cu.address_id = ad.id
group by cu.name, ad.detail_str_name, ad.city, ad.district, ad.street_name
But this code groups by detail_str_name, which does not suit me.
What can I change?
I haven't been able to test this, so it might not be entirely correct, but I think the query below should return the data you need.
This SQLTutorial article on the partition by clause might be useful.
SELECT cu.name,
       ad.detail_str_name,
       ad.city,
       ad.district,
       ad.street_name,
       COUNT(cu.name) OVER (PARTITION BY ad.city, ad.district, ad.street_name) AS num_cust
FROM customer cu
JOIN address ad ON ad.id = cu.address_id
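To check the windowed count against the sample data, here is a small sketch that loads the two tables from the question into an in-memory SQLite database via Python's `sqlite3` module. SQLite is an assumption made only so the example is runnable (it needs a SQLite build with window-function support, 3.25 or later); the original question does not name its database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (
    id INTEGER PRIMARY KEY, detail_str_name TEXT,
    city TEXT, district TEXT, street_name TEXT
);
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER);

INSERT INTO address VALUES
    (1, 'France,Paris,str.2,N5',         'Paris', 'Paris',       'str.2'),
    (2, 'France,Parise,str.2 ,N3',       'Paris', 'Paris',       'str.2'),
    (3, 'France, Lille ,str.3,N4',       'Lille', 'Lille',       'str.3'),
    (4, 'France,Paris,str.4,N3',         'Paris', 'Paris',       'str.4'),
    (5, 'France, Paris, Batignolles,N4', 'Paris', 'Batignolles', 'Batignolles');

INSERT INTO customer VALUES
    (1, 'John', 4), (2, 'Kate', 5), (3, 'Bob', 2),
    (4, 'Michael', 2), (5, 'Adriana', 3), (6, 'Ann', 1);
""")

# Every customer row is kept; the count is taken over all rows sharing
# the same (city, district, street_name), regardless of detail_str_name.
counts = conn.execute("""
    SELECT cu.name,
           ad.detail_str_name,
           COUNT(cu.name) OVER (
               PARTITION BY ad.city, ad.district, ad.street_name
           ) AS num_cust
    FROM customer cu
    JOIN address ad ON ad.id = cu.address_id
    ORDER BY cu.id
""").fetchall()

for row in counts:
    print(row)
```

Bob, Michael and Ann live at two different `detail_str_name` values but share one street, so all three rows report a count of 3, which matches the table the question asks for.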

In Oracle SQL, Add max values (row by row) from another table when other columns of the table are already populated

I have two tables, A and B. Table B has 4 columns (ID, NAME, CITY, COUNTRY); 3 columns have values and one column (ID) contains only NULLs. I want to populate the ID column in table B starting from the maximum ID in table A, so that the IDs in B are in increasing order.
TABLE A
ID NAME
------- -------
231 Bred
134 Mick
133 Tom
233 Helly
232 Kathy
TABLE B
ID NAME CITY COUNTRY
------- ------- ---------- -----------
(NULL) Alex NY USA
(NULL) Jon TOKYO JAPAN
(NULL) Jeff TORONTO CANADA
(NULL) Jerry PARIS FRANCE
(NULL) Vicky LONDON ENGLAND
The ID column in B should be populated starting at MAX(ID) + 1 from table A. The output should look like this:
TABLE B
ID NAME CITY COUNTRY
------ -------- ---------- -----------
234 Alex NY USA
235 Jon TOKYO JAPAN
236 Jeff TORONTO CANADA
237 Jerry PARIS FRANCE
238 Vicky LONDON ENGLAND
Perhaps the simplest method is to create a one-time sequence for the update:
create sequence temp_b_seq;
update b
set id = (select max(id) from a) + temp_b_seq.nextval;
drop sequence temp_b_seq;
You could actually initialize the sequence with the maximum value from a, but that requires dynamic SQL, so this seems like the simplest approach. Oracle should be smart enough to run the subquery only once.
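The effect of the update can be tried without an Oracle instance. The sketch below uses SQLite through Python's `sqlite3` module, which has no sequences, so it swaps in a different technique: each row's position (a count of rows with a smaller or equal `rowid`) stands in for `temp_b_seq.nextval`. That substitution is an assumption for illustration only and is not how the Oracle statement works internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, name TEXT);
CREATE TABLE b (id INTEGER, name TEXT, city TEXT, country TEXT);

INSERT INTO a VALUES
    (231, 'Bred'), (134, 'Mick'), (133, 'Tom'), (233, 'Helly'), (232, 'Kathy');

INSERT INTO b (name, city, country) VALUES
    ('Alex',  'NY',      'USA'),
    ('Jon',   'TOKYO',   'JAPAN'),
    ('Jeff',  'TORONTO', 'CANADA'),
    ('Jerry', 'PARIS',   'FRANCE'),
    ('Vicky', 'LONDON',  'ENGLAND');
""")

# Stand-in for the sequence: the 1-based position of each row by rowid.
conn.execute("""
    UPDATE b
    SET id = (SELECT MAX(id) FROM a)
           + (SELECT COUNT(*) FROM b AS earlier WHERE earlier.rowid <= b.rowid)
""")

updated = conn.execute("SELECT id, name FROM b ORDER BY id").fetchall()
for row in updated:
    print(row)
```

With the sample data, `MAX(id)` in A is 233, so the five rows of B receive 234 through 238 in insertion order.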

Presenting Data uniformly between two different table presentations with SQL

Hello everyone, I have a problem.
Table 1 (sorted) is laid out like this:
User ID Producer ID Company Number
JWROSE 23401 234
KXPEAR 23903 239
LMWEEM 27902 279
KJMORS 18301 183
Table 2 (unsorted) looks like this:
Client Name City Company Number
Rajat Smith London JWROSE
Robert Singh Cleveland KXPEAR
Alberto Johnson New York City LMWEEM
Betty Lee Dallas KJMORS
Chase Galvez Houston 23401
Hassan Jackson Seattle 23903
Tooti Fruity Boise 27902
Joe Trump Tokyo 18301
Donald Biden Cairo 234
Mike Harris Rome 239
Kamala Pence Moscow 279
Adolf Washington Bangkok 183
Table 1 has all of the User IDs and Producer IDs correctly paired with the Company Number.
I want to pull all the data, correctly sorted:
Client Name City User ID Producer ID Company Number
Rajat Smith London JWROSE 23401 234
Robert Singh Cleveland KXPEAR 23903 239
Alberto Johnson New York City LMWEEM 27902 279
Betty Lee Dallas KJMORS 18301 183
Chase Galvez Houston JWROSE 23401 234
Hassan Jackson Seattle KXPEAR 23903 239
Tooti Fruity Boise LMWEEM 27902 279
Joe Trump Tokyo KJMORS 18301 183
Donald Biden Cairo JWROSE 23401 234
Mike Harris Rome KXPEAR 23903 239
Kamala Pence Moscow LMWEEM 27902 279
Adolf Washington Bangkok KJMORS 18301 183
Query:
Select
b.client_name,
b.city,
a.user_id,
a.producer_id,
a.company_number
From Table 1 A
Left Join Table 2 B On a.company….
And this is where I don't know what to do, because both tables share the same values, but the Company Number column in Table 2 is a mix of User IDs, Producer IDs and Company Numbers. However, we know which Company Number each of those IDs is associated with.
As I mention in the comments, and others do, the real problem is your design. "The fact that UserID is clearly a varchar, while the other 2 columns are an int really does not make this any better", and makes this not simple (and certainly not SARGable).
To get the data in the correct order, as well, you need a column to order it on which the data lacks. I have therefore added a pseudo column, MissingIDColumn, to represent this missing column you need to add to your data; which you can do when you fix the design:
SELECT T2.ClientName,
T2.City,
T1.UserID,
T1.ProducerID,
T1.CompanyNumber
FROM (VALUES('JWROSE',23401,234),
('KXPEAR',23903,239),
('LMWEEM',27902,279),
('KJMORS',18301,183))T1(UserID,ProducerID,CompanyNumber)
JOIN (VALUES(1,'Rajat Smith ','London ','JWROSE'),
(2,'Robert Singh ','Cleveland ','KXPEAR'),
(3,'Alberto Johnson ','New York City','LMWEEM'),
(4,'Betty Lee ','Dallas ','KJMORS'),
(5,'Chase Galvez ','Houston ','23401'),
(6,'Hassan Jackson ','Seattle ','23903'),
(7,'Tooti Fruity ','Boise ','27902'),
(8,'Joe Trump ','Tokyo ','18301'),
(9,'Donald Biden ','Cairo ','234'),
(10,'Mike Harris ','Rome ','239'),
(11,'Kamala Pence ','Moscow ','279'),
(12,'Adolf Washington','Bangkok ','183'))T2(MissingIDColumn,ClientName,City,CompanyNumber)
ON T2.CompanyNumber IN (T1.UserID,CONVERT(varchar(6),T1.ProducerID),CONVERT(varchar(6),T1.CompanyNumber))
ORDER BY MissingIDColumn;
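The matching rule in the join (a Table 2 value may equal the User ID, the Producer ID or the Company Number, compared as text) can also be sketched in Python with a lookup dictionary. The names `table1`, `table2`, `lookup` and `result` are hypothetical; the list order of `table2` plays the role of the pseudo column MissingIDColumn.

```python
# Table 1: (user_id, producer_id, company_number), values from the question.
table1 = [
    ("JWROSE", 23401, 234),
    ("KXPEAR", 23903, 239),
    ("LMWEEM", 27902, 279),
    ("KJMORS", 18301, 183),
]

# Table 2: (client_name, city, mixed_key). The third column holds a user ID,
# a producer ID or a company number, always as text.
table2 = [
    ("Rajat Smith", "London", "JWROSE"),
    ("Robert Singh", "Cleveland", "KXPEAR"),
    ("Alberto Johnson", "New York City", "LMWEEM"),
    ("Betty Lee", "Dallas", "KJMORS"),
    ("Chase Galvez", "Houston", "23401"),
    ("Hassan Jackson", "Seattle", "23903"),
    ("Tooti Fruity", "Boise", "27902"),
    ("Joe Trump", "Tokyo", "18301"),
    ("Donald Biden", "Cairo", "234"),
    ("Mike Harris", "Rome", "239"),
    ("Kamala Pence", "Moscow", "279"),
    ("Adolf Washington", "Bangkok", "183"),
]

# Register every Table 1 row under all three of its identifiers, as strings,
# mirroring the IN (UserID, CONVERT(...), CONVERT(...)) condition.
lookup = {}
for user_id, producer_id, company_number in table1:
    for key in (user_id, str(producer_id), str(company_number)):
        lookup[key] = (user_id, producer_id, company_number)

# Inner-join semantics: rows of Table 2 without a match are dropped.
result = [(client, city) + lookup[key] for client, city, key in table2 if key in lookup]

for row in result:
    print(row)
```

This only works because no identifier is reused across the three columns in the sample data; a collision between, say, a producer ID and a company number would make the match ambiguous, which is another reason to fix the design.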

Unable to Display Timestamp with Select query on table

I have a table 'weatherdata' with 3 fields.
CREATE TABLE weatherdata (`value` string, snapshort_time timestamp)
PARTITIONED BY ( country string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://quickstart.cloudera:8020/user/hive/warehouse/expl.db/weatherdata'
When I run the command below on this table, the output correctly shows the 'Snapshort_time' field at the end:
hive (expl)> select * from weatherdata;
OK
{"location: {"name":"Beijing","region":"Beijing","country":"China","lat":39.93,"lon":116.39,"tz_id":"Asia/Shanghai","localtime_epoch":1486857803,"localtime":"2017-02-12 0:03"},"current":{"last_updated_epoch":1486857803,"last_updated":"2017-02-12 00:03","temp_c":-3.0,"temp_f":26.6,"is_day":0,"condition":{"text":"Clear","icon":"//cdn.apixu.com/weather/64x64/night/113.png","code":1000},"wind_mph":0.0,"wind_kph":0.0,"wind_degree":0,"wind_dir":"N","pressure_mb":1028.0,"pressure_in":30.8,"precip_mm":0.0,"precip_in":0.0,"humidity":39,"cloud":0,"feelslike_c":-3.0,"feelslike_f":26.6}} NULL 2017-02-11 08:22:36
The 'NULL' shown in the output is the value of the 'country' field.
But when I run the SELECT below, 'Snapshort_time' shows as NULL.
select get_json_object(value, '$.location.name') AS name,
get_json_object(value,'$.location.region') AS region,
get_json_object(value, '$.location.country') AS country,
get_json_object(value, '$.current.condition.text') AS text,
get_json_object(value, '$.current.feelslike_c') AS feelslike_c,
Snapshort_time
from weatherdata;
Here is the output:
OK
Beijing Beijing China Clear -3.0 NULL
Dubai Dubai United Arab Emirates Clear 24.6 NULL
London City of London, Greater London United Kingdom Mist -1.3 NULL
Moscow Moscow City Russia Clear -5.7 NULL
Paris Ile-de-France France Patchy light snow -1.2 NULL
Sydney New South Wales Australia Partly cloudy 26.3 NULL
Tokyo Tōkyō Japan Partly cloudy -1.3 NULL
Toronto Ontario Canada Overcast -4.0 NULL
Washington District of Columbia United States of America Partly cloudy 5.9 NULL
Time taken: 0.339 seconds, Fetched: 9 row(s)
What could be the reason?

SQL logic for getting records in a single row for a unique id

Below is a snapshot of the table.
**Acctno ClientNo ClientName PrimaryOffId SecondaryOffID**
101 11111 ABC corp 3 Not Defined
102 11116 XYZ Inc 5 Not Defined
103 11113 PQRS Corp 2 9
104 55555 Food LLC 4 11
105 99999 Kwlg Co 1 Not Defined
106 99999 Kwlg Co 1 Not Defined
107 11112 LMN Corp Not Defined 6
108 11112 LMN Corp Not Defined 6
109 11115 Sleep Co 4 10
110 44444 Cool Co Not Defined 8
111 11114 Sail LLC 3 Not Defined
112 66666 Fun Inc 1 Not Defined
113 88888 Job LLC 5 12
114 22222 Acc Co Not Defined Not Defined
115 77777 Good Corp 2 Not Defined
116 33333 City LLC Not Defined 7
117 33333 City LLC Not Defined 7
118 33333 City LLC Not Defined 7
119 11111 ABC corp 3 Not Defined
I want to replace PrimaryOffID and SecondaryOffID with the corresponding names from this table:
EmpID Names
1 Cathy
2 Chris
3 John
4 Kevin
5 Mark
6 Celine
7 Jane
8 Phil
9 Jess
10 Jose
11 Nick
12 Rosy
The result should look like this. Notice that if Cathy is the primary officer, she can't be the secondary officer, and vice versa. This logic applies to all the names.
Acctno ClientNo Client Name PrimOffName SecondaryOffName
101 11111 ABC corp John Not Defined
102 11116 XYZ Inc Mark Not Defined
103 11113 PQRS Corp Chris Jess
104 55555 Food LLC Kevin Nick
105 99999 Kwlg Co Cathy Not Defined
106 99999 Kwlg Co Cathy Not Defined
107 11112 LMN Corp Not Defined Celine
108 11112 LMN Corp Not Defined Celine
109 11115 Sleep Co Kevin Jose
110 44444 Cool Co Not Defined Phil
111 11114 Sail LLC John Not Defined
112 66666 Fun Inc Cathy Not Defined
113 88888 Job LLC Mark Rosy
114 22222 Acc Co Not Defined Not Defined
115 77777 Good Corp Chris Not Defined
116 33333 City LLC Not Defined Jane
117 33333 City LLC Not Defined Jane
118 33333 City LLC Not Defined Jane
119 11111 ABC corp John Not Defined
But instead it looks like this:
Acctno ClientNo ClientName PrimOffName SecondaryOffName
101 11111 ABC corp John Not Defined
102 11116 XYZ Inc Mark Not Defined
103 11113 PQRS Corp Chris Not Defined
103 11113 PQRS Corp Not Defined Jess
104 55555 Food LLC Kevin Not Defined
104 55555 Food LLC Not Defined Nick
105 99999 Kwlg Co Cathy Not Defined
106 99999 Kwlg Co Cathy Not Defined
107 11112 LMN Corp Not Defined Celine
108 11112 LMN Corp Not Defined Celine
109 11115 Sleep Co Kevin Not Defined
109 11115 Sleep Co Not Defined Jose
110 44444 Cool Co Not Defined Phil
111 11114 Sail LLC John Not Defined
112 66666 Fun Inc Cathy Not Defined
113 88888 Job LLC Mark Not Defined
113 88888 Job LLC Not Defined Rosy
114 22222 Acc Co Not Defined Not Defined
115 77777 Good Corp Chris Not Defined
116 33333 City LLC Not Defined jane
117 33333 City LLC Not Defined jane
118 33333 City LLC Not Defined jane
119 11111 ABC corp John Not Defined
Notice that Acctno is no longer unique. Wherever both names should appear together in one row, the output splits them across two rows, creating multiple records. I tried various options but none worked. Please be aware that I am creating this report in Cognos Studio. Please suggest a query that gives the desired result. Thanks in advance; I appreciate your help.
You don't state which version of Cognos you're using. "Cognos Studio" is ambiguous. I'm most familiar with 8.4.1, but even then you don't say if you're trying to define this in the Cognos model, Query Studio, Event Studio or Report Studio.
Second, you should always show what you've got so far when asking questions on StackOverflow. People want to see what you have tried so they can help you fix it, not do the lion's share of the work for you. That's why you got downvotes.
As far as plain SQL, you'll want to do this:
SELECT a.Acctno, a.ClientNo, a.ClientName,
       coalesce(e1.Names, 'Not Defined') "PrimaryOffName",
       coalesce(e2.Names, 'Not Defined') "SecondaryOffName"
FROM Account a
LEFT OUTER JOIN Emp e1
ON a.PrimaryOffID = e1.EmpID
LEFT OUTER JOIN Emp e2
ON a.SecondaryOffID = e2.EmpID
I made up table names. You can do this in Report Studio by creating two queries for Emp and outer joining them in succession to the Account query.
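A double outer join of this kind, with the first join keyed on PrimaryOffID and the second on SecondaryOffID, can be tried on a few of the sample rows using SQLite through Python's `sqlite3` module. SQLite, the made-up table names `Account` and `Emp`, and storing "Not Defined" as NULL are all assumptions for the sake of a runnable sketch; this is not Cognos syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp (EmpID INTEGER PRIMARY KEY, Names TEXT);
CREATE TABLE Account (
    Acctno INTEGER PRIMARY KEY, ClientNo INTEGER, ClientName TEXT,
    PrimaryOffID INTEGER, SecondaryOffID INTEGER  -- NULL means "Not Defined"
);

INSERT INTO Emp VALUES (2, 'Chris'), (3, 'John'), (6, 'Celine'), (9, 'Jess');

INSERT INTO Account VALUES
    (101, 11111, 'ABC corp',  3,    NULL),
    (103, 11113, 'PQRS Corp', 2,    9),
    (107, 11112, 'LMN Corp',  NULL, 6),
    (114, 22222, 'Acc Co',    NULL, NULL);
""")

# Emp is joined twice under different aliases, once per officer column,
# so each account stays on a single row.
report = conn.execute("""
    SELECT a.Acctno,
           a.ClientName,
           COALESCE(e1.Names, 'Not Defined') AS PrimaryOffName,
           COALESCE(e2.Names, 'Not Defined') AS SecondaryOffName
    FROM Account a
    LEFT OUTER JOIN Emp e1 ON a.PrimaryOffID = e1.EmpID
    LEFT OUTER JOIN Emp e2 ON a.SecondaryOffID = e2.EmpID
    ORDER BY a.Acctno
""").fetchall()

for row in report:
    print(row)
```

Account 103 comes back as one row with both Chris and Jess, rather than the two half-filled rows shown in the question.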
If you're able to, you'll want to move the OffID fields to a separate junction table and remove them from the Account table. You can then create a Status field or flag in that junction table that identifies primary and secondary.