Group by in Impala on customer ID with different account numbers

CUSTOMER_ID ACCT_NUMBER TAG
CUST_ID ACCT_NO1 NPA
CUST_ID ACCT_NO2 NPA
CUST_ID ACCT_NO3
Table A has the data shown above.
I need to tag the new ACCT_NO3 as 'NPA'.
Conditions:
First, group table A by cust_id.
If the cust_id has two different account numbers with the 'NPA' tag, then mark the third one as 'NPA' as well.
How can I do this in Impala?
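One possible reading of the conditions above, as a sketch rather than a tested Impala answer: find the customers that have at least two distinct account numbers tagged 'NPA', then join that list back to the table and tag every row of those customers. The SQL uses only a grouped subquery and a LEFT JOIN, which Impala should accept, but here it is run through Python's sqlite3 purely so it can be executed locally. The table name `a` and the second customer `OTHER_ID` are assumptions added for illustration.

```python
import sqlite3

# Hypothetical stand-in for table A; SQLite is used only so the sketch runs locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (customer_id TEXT, acct_number TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [
        ("CUST_ID", "ACCT_NO1", "NPA"),
        ("CUST_ID", "ACCT_NO2", "NPA"),
        ("CUST_ID", "ACCT_NO3", None),
        ("OTHER_ID", "ACCT_NO4", "NPA"),  # invented row: only one NPA account
        ("OTHER_ID", "ACCT_NO5", None),   # invented row: should stay untagged
    ],
)

QUERY = """
SELECT a.customer_id,
       a.acct_number,
       -- Tag every row of a qualifying customer; otherwise keep the existing tag.
       CASE WHEN npa.customer_id IS NOT NULL THEN 'NPA' ELSE a.tag END AS tag
FROM a
LEFT JOIN (
    -- Customers with at least two distinct accounts already tagged 'NPA'.
    SELECT customer_id
    FROM a
    WHERE tag = 'NPA'
    GROUP BY customer_id
    HAVING COUNT(DISTINCT acct_number) >= 2
) npa ON npa.customer_id = a.customer_id
ORDER BY a.customer_id, a.acct_number
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If "two different account numbers" should mean exactly two rather than at least two, the HAVING condition would need to change accordingly.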

Related

How to get date from closest old record for the given customer if a new one appears

I am dealing with a policy records table inside Google BigQuery. My table has three columns: INDVDL_ID, POLICY_ID and CNV_DT. Every unique client is identified by INDVDL_ID and may have one or more records in the table, depending on how many policies they have. Each policy has its own ID (POLICY_ID) and the date on which it was bought (CNV_DT). The code I am using to pull the data looks like this:
SELECT INDVDL_ID, POLICY_ID, CNV_DT
FROM `Policy_table.TP_Policies.policy_20191023`
What I have:
<table><tbody><tr><th>INDVDL_ID</th><th>POLICY_ID</th><th>CNV_DT</th></tr><tr><td>1</td><td>1</td><td>2008-01-01</td></tr><tr><td>1</td><td>2</td><td>2008-04-31</td></tr><tr><td>1</td><td>3</td><td>2008-12-23</td></tr><tr><td>3</td><td>4</td><td>2009-08-19</td></tr><tr><td>2</td><td>5</td><td>2010-06-12</td></tr><tr><td>2</td><td>6</td><td>2011-11-12</td></tr></tbody></table>
What I would like to pull is a table where, for every additional policy that a customer has bought, I also have the CNV_DT of their prior purchase.
What I would like to have:
<table><tbody><tr><th>INDVDL_ID</th><th>POLICY_ID</th><th>CNV_DT</th><th>PRIOR_CNV_DT</th></tr><tr><td>1</td><td>1</td><td>2008-01-01</td><td> </td></tr><tr><td>1</td><td>2</td><td>2008-04-31</td><td>2008-01-01</td></tr><tr><td>1</td><td>3</td><td>2008-12-23</td><td>2008-04-31</td></tr><tr><td>3</td><td>4</td><td>2009-08-19</td><td> </td></tr><tr><td>2</td><td>5</td><td>2010-06-12</td><td> </td></tr><tr><td>2</td><td>6</td><td>2011-11-12</td><td>2010-06-12</td></tr></tbody></table>
You seem to want lag():
SELECT INDVDL_ID, POLICY_ID, CNV_DT,
LAG(CNV_DT) OVER (PARTITION BY INDVDL_ID ORDER BY CNV_DT) as PRIOR_CNV_DT
FROM `Policy_table.TP_Policies.policy_20191023`;
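To see what LAG() returns per partition, here is a small sketch that runs the same window expression against an in-memory SQLite table instead of BigQuery. It assumes a Python build whose bundled SQLite supports window functions (3.25 or later); the table name `policy` is hypothetical, and the rows are the ones for customers 2 and 3 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical local table standing in for the BigQuery policy table.
conn.execute(
    "CREATE TABLE policy (INDVDL_ID INTEGER, POLICY_ID INTEGER, CNV_DT TEXT)"
)
conn.executemany(
    "INSERT INTO policy VALUES (?, ?, ?)",
    [
        (3, 4, "2009-08-19"),
        (2, 5, "2010-06-12"),
        (2, 6, "2011-11-12"),
    ],
)

# LAG looks one row back within each customer's rows, ordered by purchase date.
lag_rows = conn.execute(
    """
    SELECT INDVDL_ID, POLICY_ID, CNV_DT,
           LAG(CNV_DT) OVER (PARTITION BY INDVDL_ID ORDER BY CNV_DT) AS PRIOR_CNV_DT
    FROM policy
    ORDER BY POLICY_ID
    """
).fetchall()

for row in lag_rows:
    print(row)
```

The first policy of each customer has no earlier row in its partition, so PRIOR_CNV_DT comes back as NULL (None in Python) for those rows.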

How to get unique customer names that have different IDs

I am working with a table that contains Account_No as a unique ID, plus Customer_Name and Building_Name. The table below is an example:
In a few cases the customer name and building are the same but the Account_No differs. I need to remove duplicate names even though they have unique Account_No values. Building_Name and Customer_Name are tied together. For example, "William----Science City" and "William-----River Club" should be counted as two customers, since they reside in different buildings. The result table should look as below:
I need to use SQL to create the resulting table. Kindly use Customer_Table as the reference for the SQL query. Thanks
Select Min(Account_No) As Account_No
,Customer_Name,Building_Name
From Customer_Table
Group By Customer_Name, Building_Name
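A quick way to check the query above is to run it on a few rows. The example tables from the question are not available, so the account numbers and the customer "Maria" below are invented for illustration; only the William / Science City / River Club pairing comes from the question. SQLite is used here just as a convenient local engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Table "
    "(Account_No INTEGER, Customer_Name TEXT, Building_Name TEXT)"
)
# Invented sample rows: 101 and 102 are the same customer in the same building.
conn.executemany(
    "INSERT INTO Customer_Table VALUES (?, ?, ?)",
    [
        (101, "William", "Science City"),
        (102, "William", "Science City"),
        (103, "William", "River Club"),
        (104, "Maria", "River Club"),
    ],
)

# One row per (name, building) pair, keeping the smallest account number.
unique_customers = conn.execute(
    """
    SELECT MIN(Account_No) AS Account_No, Customer_Name, Building_Name
    FROM Customer_Table
    GROUP BY Customer_Name, Building_Name
    ORDER BY 1
    """
).fetchall()

for row in unique_customers:
    print(row)
```

MIN(Account_No) is one arbitrary but deterministic way to pick a representative account; MAX would work equally well if the newest account is preferred.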

How do I use array_agg with a condition?

I have a table with a list of potential customers, their activity, and their sales representative. Every customer can have at most one sales rep. I've built a summary table where I aggregate the customer activity, group it by the sales rep, and filter by the customer creation date. This is NOT a cohort (the customers do not all correspond to the scheduled_flights; rather, this is a snapshot of activity for a given period of time). It looks something like this:
Now, in addition to the total number of customers, I'd also like to output an array of those actual customers. The customers field is currently calculated by performing sum(is_customer) as customers and then grouping by the sales rep. To build the array, I've tried to do array_agg(customer_name) which outputs the list of all customer names -- I just need the list of names who also satisfy the condition that is_customer = 1, but I can't use that as a where clause since it would filter out other activity, like scheduled and completed flights for customers that were not new.
This should probably work:
array_agg(case when is_customer = 1 then customer_name end) within group (order by customer_name)
Snowflake should ignore NULL values in the aggregation.
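The same conditional-aggregation idea can be tried locally. This sketch is not Snowflake: it swaps in SQLite's group_concat, which also skips NULL values, to show that wrapping the column in a CASE expression keeps only the names where is_customer = 1 while SUM still sees every row. The table name `activity` and all rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical activity table; names and reps are invented.
conn.execute(
    "CREATE TABLE activity (sales_rep TEXT, customer_name TEXT, is_customer INTEGER)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        ("rep_a", "Acme", 1),
        ("rep_a", "Bolt", 0),
        ("rep_a", "Cargo", 1),
        ("rep_b", "Delta", 0),
    ],
)

# The CASE yields NULL for non-customers, and the aggregate drops those NULLs,
# so no WHERE clause is needed and the other aggregates are unaffected.
result = conn.execute(
    """
    SELECT sales_rep,
           SUM(is_customer) AS customers,
           group_concat(CASE WHEN is_customer = 1 THEN customer_name END) AS names
    FROM activity
    GROUP BY sales_rep
    ORDER BY sales_rep
    """
).fetchall()

# group_concat returns one comma-separated string (or NULL); turn it into a sorted list.
summary = {
    rep: (count, sorted(names.split(",")) if names else [])
    for rep, count, names in result
}
print(summary)
```

In Snowflake itself the result would be a real array rather than a delimited string, and the ordering would come from the within group clause instead of sorting in Python.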

How to count how many times certain values appear in a table in SQL and return that number in a column?

I've used the COUNT function to determine how many rows there are in a table or how often a value appears in a table.
However, I want to return the 'count' for multiple values in a table as a separate column.
Say we have a customer table with columns: Customer ID #, Name, Phone Number.
Say we also have a sales table with columns: Customer ID, Item Purchased, Date
I would like my query to return a column for customer ID and a column for # of times that customer ID appeared in the sales table. I would like to do this for all of my customer IDs at once--any tips?
You can use group by:
select customer_id,
count(*)
from sales
group by customer_id
This will return one row per customer ID, with the count of matching rows in the sales table.
You want to use GROUP BY:
Select Count(*), CustomerID
from Sales
GROUP BY CustomerID
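A small runnable sketch of the GROUP BY approach, using SQLite through Python with invented rows. One thing worth noting: grouping the sales table alone only returns customers that have at least one sale. If customers with zero purchases should also appear, a LEFT JOIN from the customer table is one option; the second query shows that variant. Table and column names here are assumptions based on the question's description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modeled on the question; all rows are invented.
conn.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT, phone TEXT)")
conn.execute("CREATE TABLE sales (customer_id INTEGER, item TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ann", "555-0101"), (2, "Bob", "555-0102"), (3, "Cy", "555-0103")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "chips", "2020-01-01"), (1, "soda", "2020-01-02"), (2, "chips", "2020-01-03")],
)

# Count of sales rows per customer that appears in the sales table.
per_customer = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS purchase_count
    FROM sales
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

# Variant that keeps customers with no sales: COUNT(s.customer_id) ignores
# the NULLs produced by the LEFT JOIN, so those customers get 0.
all_customers = conn.execute(
    """
    SELECT c.customer_id, COUNT(s.customer_id) AS purchase_count
    FROM customer c
    LEFT JOIN sales s ON s.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
    """
).fetchall()

print(per_customer)
print(all_customers)
```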

Simple database for product order

I want to make a simple database to order products, like chips/drinks (as full product, without any specific info about product just name and price for unit)
I designed this but I'm not sure if it's good:
**Person:**
id
username
name
password
phone
email
street
zip
**order:**
id
person_id
product_id
date
quantity (necessary?)
status (done or not)
**product:**
id
name
price
relations:
[person] 1 --- 1 [order] 1 --- many [product]
(I'm not sure about relations and fields)
With your design, each order can contain only a single product (even if you use the quantity column).
I would modify the Order table:
**order:**
id
person_id
date
status (done or not)
And I would add a new table:
**OrderDetails**
id
order_id
product_id
quantity
You may want to read up on database normalization. A table should only have columns that relate directly to it. For instance, date in the order table is valid, because it refers to when the order was made. On the other hand, it wouldn't be valid in the person table (unless it referred to the date the person joined). Similarly, quantity refers to a product within an order (and so belongs in OrderDetails), not to the Order or the Product.
You will probably need an intermediate table between order and product, so that the same order can be linked to many different products.
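The suggested layout could be sketched as SQLite DDL, run here through Python so it can be tried directly. This is only an illustration of the idea, not a finished schema: the column types, the foreign keys, and the table names `orders` and `order_details` are assumptions (`order` is a reserved word in SQL, so it is renamed), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        id INTEGER PRIMARY KEY, username TEXT, name TEXT, password TEXT,
        phone TEXT, email TEXT, street TEXT, zip TEXT
    );
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    -- One person can place many orders.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        date TEXT,
        status TEXT
    );
    -- One row per product within an order; quantity lives here.
    CREATE TABLE order_details (
        id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id),
        product_id INTEGER REFERENCES product(id),
        quantity INTEGER
    );
    INSERT INTO person (id, username, name) VALUES (1, 'jdoe', 'J. Doe');
    INSERT INTO product VALUES (1, 'chips', 1.5), (2, 'drink', 2.0);
    INSERT INTO orders VALUES (1, 1, '2020-01-01', 'not done');
    INSERT INTO order_details VALUES (1, 1, 1, 2), (2, 1, 2, 3);
    """
)

# A single order now spans several products; its total is derived by joining.
line_count, total = conn.execute(
    """
    SELECT COUNT(*), SUM(d.quantity * p.price)
    FROM order_details d
    JOIN product p ON p.id = d.product_id
    WHERE d.order_id = 1
    """
).fetchone()

print(line_count, total)
```

With this layout the relations read person 1 --- many orders, orders 1 --- many order_details, and product 1 --- many order_details.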