How do I get rid of the error "Column alias list has 1 entries but 't' has 4 columns available" when creating a view on top of a view in AWS Athena? - sql

I have created a "dummy" table to access records stored as Parquet in S3, and I have created a "dummy_flattened" view on top of the "dummy" table using this query:
CREATE OR REPLACE VIEW dummy_flattened AS
select
deviceid
, a.postid
, a.craterid
, a.posttype
, a.category
FROM dummy cross join UNNEST(postlist) as t(a)
Example of a postlist row:
[{category=[{id=1, name=Dance}], craterid=1, postid=1, posttype=a},
{category=[{id=2, name=Dance}], craterid=2, postid=2, posttype=b}]
Now when I try to create a view on top of the dummy_flattened view using the query
create or replace view test as select distinct deviceid from dummy_flattened limit 100
Athena gives me this error:
Failed analyzing stored view 'logs.dummy_flattened': line 19:12: Column alias list has 1 entries but 't' has 4 columns available
But I can view the results when I run the same SELECT directly:
select distinct deviceid from dummy_flattened limit 100
Also, when I don't include a.postid, a.craterid, ... while creating dummy_flattened, I can create the test view successfully.
However, I couldn't figure out how to create a view on top of a flattened view that includes postid, craterid, posttype, ... without running into an error.
I have been trying multiple combinations for the past few hours with no luck.
A few of the combinations I tried while creating the flattened view:
1. select deviceid,t.* as (a) FROM dummy cross join UNNEST(postlist) as t
2. select deviceid,(t.a.*) as (c,d,e,f) FROM dummy cross join UNNEST(postlist) as t(a)
3. The one I mentioned at the beginning
I would like to know if I'm missing something, or whether we simply cannot create views on top of unnested records. I have been referring to multiple documents and other similar questions on Stack Overflow but couldn't find anything. I'm posting this as a last resort.

I'm currently using Athena engine version 3, and that's where the problem occurs. I'm not sure why, but the error went away when I switched to Athena engine version 2. It might be a bug.

The docs mention that every field of the row in the array is expanded into a separate column:
UNNEST can be used in combination with an ARRAY of ROW structures for expanding each field of the ROW into a corresponding column
So you need to specify the corresponding number of column aliases:
CREATE OR REPLACE VIEW dummy_flattened AS
select
deviceid
, postid
, craterid
, posttype
, category
FROM dummy,
UNNEST(postlist) as t(category, craterid, postid, posttype)

Related

How to retrieve ambiguous columns from a hive table using subquery?

Main table:
CREATE EXTERNAL TABLE user(language STRING,snapshot_time STRING,products STRUCT<id:STRING,name:STRING>,item STRUCT<quantity:ARRAY<STRUCT<name:STRING>>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/input/sample';
I'm trying to insert the columns "products.name" and "A.name" into the user_prod_info table. Because both column names are the same, I'm getting an "Ambiguous column reference text in q" error.
Insert command:
INSERT OVERWRITE TABLE user_prod_info
SELECT q.* FROM (
SELECT row_number() OVER (PARTITION BY products.id ORDER BY snapshot_time DESC) AS temp_row_num,
language,
snapshot_time,
products.id,
products.name,
A.name
FROM user as raw
LATERAL VIEW EXPLODE(item.quantity) quantity as A
) q WHERE temp_row_num == 1;
This command cannot resolve the field because there are two "name" fields: one in "products" and the other in "A".
I tried aliasing the second one as "A.name as name1". With that I can insert the data without errors, but one record ends up stored as 3 rows with some nulls in them.
I'm stuck here. Can anyone please help me out?

ORA-01792: maximum number of columns in a table or view is 1000 error while using WITH in sql

I have a query :
WITH abc AS
(
(SELECT SRC_DATA.*,
(SELECT MAX(DECODE(OBJ.AUD_ACTION_FLAG,'D',OBJ.OUPDATE_COUNT,OBJ.NUPDATE_COUNT))
FROM SMARTTRIAL_ODR_LANDING.AUD_TRIAL_DESIGN OBJ
WHERE OBJ.AUD_DATE_CHANGED BETWEEN TO_DATE('01-JAN-1900') AND (SRC_DATA.AUD_DATE_CHANGED)
AND DECODE(OBJ.AUD_ACTION_FLAG,'D',OBJ.OTRIAL_NO,OBJ.NTRIAL_NO)= DECODE(SRC_DATA.AUD_ACTION_FLAG,'D',SRC_DATA.OTRIAL_NO,SRC_DATA.NTRIAL_NO)
AND OBJ.AUD_ACTION_FLAG <> 'D'
) UPDATE_COUNT,
/***Multiple select statement like above with many other look up tables like AUD_TRIAL_DESIGN ****/
FROM SMARTTRIAL_ODR_LANDING.AUD_TRIAL SRC_DATA /***AUD_TRIAL is the base table***/
),
WITH def AS
(SELECT OBJ_DATA .*,
/***Similar statement as mentioned in above block and lookup table is AUD_OBJECTIVE***/
FROM SMARTTRIAL_ODR_LANDING.AUD_TRIAL_OBJECTIVE OBJ_DATA /***AUD_TRIAL_OBJECTIVE is the base table***/
)
----Query to select columns-----
FROM abc
LEFT JOIN def
LEFT JOIN xyz ON (column from def = column from xyz)
For a query with a similar structure, the following error is returned:
ORA-01792: maximum number of columns in a table or view is 1000
01792. 00000 - "maximum number of columns in a table or view is 1000"
*Cause: An attempt was made to create a table or view with more than 1000
columns, or to add more columns to a table or view which pushes
it over the maximum allowable limit of 1000. Note that unused
columns in the table are counted toward the 1000 column limit.
*Action: If the error is a result of a CREATE command, then reduce the
number of columns in the command and resubmit. If the error is
a result of an ALTER TABLE command, then there are two options:
1) If the table contained unused columns, remove them by executing
ALTER TABLE DROP UNUSED COLUMNS before adding new columns;
2) Reduce the number of columns in the command and resubmit.
Could anyone please suggest a solution?
We had a similar problem (here is an excerpt from the SR):
Creating view generates ORA-01792 maximum number of columns in a table or view is 1000
We have a new application that has a view that contains 35 columns. However, when creating it, it errors out stating that there are over 1000 columns, which is false. I will attach the view definition
Here is what Oracle said (and it did fix the problem):
Bug 19893041 : ORA-01792 HAPPEN WHEN UPDATE TO 12.1.0.2
closed as dup of
Bug 19509982 : DISABLE FIX FOR RAISING ORA-1792 BY DEFAULT.
Solution:
A. SQL> alter system set "_fix_control"='17376322:OFF';
or
B. Apply patch 19509982
(no conflicts found with the attached opatch)
That may be the same issue you're encountering.

Validate Data in SQL Server Table

I am trying to validate the data present in SQL Server table using a stored procedure.
In one of the validation rules, I have to check whether the value of a particular column is present in another table.
Suppose I have a staging table with the following columns: Cat_ID, Amount, SRC_CDE
I have a 'maintable' with the following columns: CatID, Cat_Name
I have to validate, for each row, whether the Cat_ID present in the staging table exists in 'maintable'.
I am using the following statement to validate:
if ((select count(*) from maintable where CatID = @Cat_id) > 0)
-- Do something if data present
I want to know if there is a better way of doing this than running a select query for every row.
Can I use some sort of array, where I fetch all the CatID values from maintable and then check against it, instead of using a select query?
Thanks
Use a left join to list all the invalid rows:
select
staging.*
from
staging
left join maintable
on staging.Cat_ID = maintable.CatID
where maintable.CatID is null
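To try the set-based check above without a SQL Server instance, here is a minimal sketch that runs the same anti-join in SQLite through Python's standard sqlite3 module. The sample rows are made up for illustration, and SQLite is only a stand-in: the two queries are plain SQL, but the surrounding setup would differ in SQL Server. A NOT EXISTS form is included as an equivalent alternative.

```python
import sqlite3

# In-memory database with the two tables from the question.
# The sample rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maintable (CatID INTEGER PRIMARY KEY, Cat_Name TEXT);
    CREATE TABLE staging (Cat_ID INTEGER, Amount REAL, SRC_CDE TEXT);
    INSERT INTO maintable VALUES (1, 'Books'), (2, 'Music');
    INSERT INTO staging VALUES (1, 10.0, 'A'), (3, 5.5, 'B'), (2, 7.25, 'C');
""")

# Anti-join: keep staging rows that found no partner in maintable.
invalid_left_join = conn.execute("""
    SELECT s.Cat_ID, s.Amount, s.SRC_CDE
    FROM staging AS s
    LEFT JOIN maintable AS m ON s.Cat_ID = m.CatID
    WHERE m.CatID IS NULL
""").fetchall()

# Equivalent formulation with NOT EXISTS.
invalid_not_exists = conn.execute("""
    SELECT s.Cat_ID, s.Amount, s.SRC_CDE
    FROM staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM maintable AS m WHERE m.CatID = s.Cat_ID)
""").fetchall()

print(invalid_left_join)  # [(3, 5.5, 'B')]
```

Both queries validate the whole staging table in one pass, so no per-row select is needed.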

SQL: I need to take two fields I get as a result of a SELECT COUNT statement and populate a temp table with them

So I have a table which has a bunch of information and a bunch of records. But there will be one field in particular I care about, in this case #BegAttField# where only a subset of records have it populated. Many of them have the same value as one another as well.
What I need to do is get a count (minus 1) of all duplicates, then populate the first record in the bunch with that count value in a new field. I have another field I call BegProd that will match #BegAttField# for each "first" record.
I'm just stuck as to how to make this happen. I may have been on the right path, but who knows. The SELECT statement gets me two fields and as many records as there are unique #BegAttField# values. But once I have them, I haven't been able to work with them.
Here's my whole set of code, trying to use a temporary table and SELECT INTO to try and populate it. (Note: the fields with # around the names are variables for this 3rd party app)
CREATE TABLE #temp (AttCount int, BegProd varchar(255))
SELECT COUNT(d.[#BegAttField#])-1 AS AttCount, d.[#BegAttField#] AS BegProd
INTO [#temp] FROM [Document] d
WHERE d.[#BegAttField#] IS NOT NULL GROUP BY [#BegAttField#]
UPDATE [Document] d SET d.[#NumAttach#] =
SELECT t.[AttCount] FROM [#temp] t INNER JOIN [Document] d1
WHERE t.[BegProd] = d1.[#BegAttField#]
DROP TABLE #temp
Unfortunately I'm running this script through a 3rd party database application that uses SQL as its back-end. So the errors I get are simply: "There is already an object named '#temp' in the database. Incorrect syntax near the keyword 'WHERE'. "
Comment out the CREATE TABLE statement. The SELECT INTO creates that #temp table.
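For the overall goal (store count minus 1 on the "first" record of each group), here is a rough sketch that avoids the temp table by using a correlated subquery in the UPDATE. It runs in SQLite through Python's sqlite3 module, not in SQL Server, and the table and column names (document, doc_id, beg_att, num_attach) are hypothetical stand-ins for the application's fields. It assumes the "first" record is the one whose own identifier equals its group value.

```python
import sqlite3

# Hypothetical schema: doc_id identifies a record, beg_att is the group value
# (the role of #BegAttField# in the question), num_attach receives the count.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (doc_id TEXT, beg_att TEXT, num_attach INTEGER);
    INSERT INTO document VALUES
        ('D1', 'D1', NULL),
        ('D2', 'D1', NULL),
        ('D3', 'D1', NULL),
        ('D4', NULL, NULL),
        ('D5', 'D5', NULL),
        ('D6', 'D5', NULL);
""")

# For each "first" record (doc_id equals its group value), store the number
# of rows in the group minus one. Rows with a NULL group are left untouched.
conn.execute("""
    UPDATE document
    SET num_attach = (
        SELECT COUNT(*) - 1
        FROM document AS d
        WHERE d.beg_att = document.beg_att
    )
    WHERE doc_id = beg_att
""")

rows = conn.execute(
    "SELECT doc_id, num_attach FROM document ORDER BY doc_id"
).fetchall()
print(rows)
```

With the sample rows, D1 gets 2 and D5 gets 1, while every other record keeps NULL.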

SQL query select from table and group on other column

I'm phrasing the question title poorly as I'm not sure what to call what I'm trying to do but it really should be simple.
I've a link / join table with two ID columns. I want to run a check before saving new rows to the table.
The user can save attributes through a webpage but I need to check that the same combination doesn't exist before saving it. With one record it's easy as obviously you just check if that attributeId is already in the table, if it is don't allow them to save it again.
However, if the user chooses a combination of that attribute and another one then they should be allowed to save it.
Here's an image of what I mean:
So if a user now tried to save an attribute with an ID of 1, it will stop them, but I need it to also stop them if they tried IDs 1 and 10, so long as both 1 and 10 had the same productAttributeId.
I'm confusing this in my explanation but I'm hoping the image will clarify what I need to do.
This should be simple so I presume I'm missing something.
If I understand the question properly, you want to prevent the combination of AttributeId and ProductAttributeId from being reused. If that's the case, simply make them a combined primary key, which is by nature UNIQUE.
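As an illustration of the combined-key idea, the sketch below creates the join table with a composite primary key and shows the database rejecting a repeated pair. It uses SQLite through Python's sqlite3 module as a stand-in, and the column types are assumptions. Note that this only blocks an exact repeat of one (AttributeId, ProductAttributeId) pair. It does not by itself detect that a whole set of attributes already exists as a group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite primary key over both ID columns (column types are assumed).
conn.execute("""
    CREATE TABLE MyJoinTable (
        AttributeId INTEGER NOT NULL,
        ProductAttributeId INTEGER NOT NULL,
        PRIMARY KEY (AttributeId, ProductAttributeId)
    )
""")
conn.execute("INSERT INTO MyJoinTable VALUES (1, 1)")
# Same attribute under a different ProductAttributeId is still allowed.
conn.execute("INSERT INTO MyJoinTable VALUES (1, 2)")

# Repeating an existing pair violates the key and raises IntegrityError.
try:
    conn.execute("INSERT INTO MyJoinTable VALUES (1, 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM MyJoinTable").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```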
If that's not feasible, create a stored procedure that runs a query against the join for instances of the AttributeId. If the query returns 0 instances, insert the row.
Here's some light code to present the idea (may need to be modified to work with your database):
-- @@ROWCOUNT would be 1 after SELECT COUNT(1), so test for existence directly
IF NOT EXISTS (SELECT 1 FROM MyJoinTable WHERE AttributeId = @RequestedID)
BEGIN
INSERT INTO MyJoinTable ...
END
You can control your inserts via a stored procedure. My understanding is that
users can select a combination of Attributes, such as
just 1
1 and 10 together
1,4,5,10 (4 attributes)
These need to enter the table as a single "batch" against a (new?) productAttributeId
So if (1,10) was chosen, this needs to be blocked because 1-2 and 10-2 already exist.
What I suggest
The stored procedure should take the attributes as a single list, e.g. '1,2,3' (comma separated, no spaces, just integers)
You can then use a string splitting UDF or an inline XML trick (as shown below) to break it into rows of a derived table.
Test table
create table attrib (attributeid int, productattributeid int)
insert attrib select 1,1
insert attrib select 1,2
insert attrib select 10,2
Here I use a variable, but you can incorporate as a SP input param
declare @t nvarchar(max) set @t = '1,2,10'
select top(1)
t.productattributeid,
count(t.productattributeid) count_attrib,
count(*) over () count_input
from (select convert(xml,'<a>' + replace(@t,',','</a><a>') + '</a>') x) x
cross apply x.x.nodes('a') n(c)
cross apply (select n.c.value('.','int')) a(attributeid)
left join attrib t on t.attributeid = a.attributeid
group by t.productattributeid
order by count_attrib desc
Output
productattributeid count_attrib count_input
2 2 3
The 1st column gives you the productattributeid that has the most matches
The 2nd column gives you how many attributes were matched using the same productattributeid
The 3rd column is how many attributes exist in the input
If you compare the last 2 columns and the counts
match - you can use the productattributeid to attach to the product which has all these attributes
don't match - then you need to do an insert to create a new combination
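For comparison, here is a sketch of a slightly different technique: rather than finding the productattributeid with the most matches and comparing counts afterwards, it looks directly for a group whose attribute set is exactly the requested set. It runs against SQLite through Python's sqlite3 module, reuses the test table from above, and assumes each (attributeid, productattributeid) pair appears at most once. The helper name find_existing_combination is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attrib (attributeid INTEGER, productattributeid INTEGER);
    INSERT INTO attrib VALUES (1, 1), (1, 2), (10, 2);
""")

def find_existing_combination(conn, attribute_ids):
    """Return a productattributeid whose attributes are exactly the given
    set, or None if no such group exists (hypothetical helper)."""
    wanted = sorted(set(attribute_ids))
    if not wanted:
        return None
    placeholders = ",".join("?" for _ in wanted)
    # A group matches when it has as many rows as requested attributes
    # and every one of those rows is one of the requested attributes.
    row = conn.execute(f"""
        SELECT productattributeid
        FROM attrib
        GROUP BY productattributeid
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN attributeid IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        ORDER BY productattributeid
        LIMIT 1
    """, (len(wanted), *wanted, len(wanted))).fetchone()
    return row[0] if row else None

print(find_existing_combination(conn, [1, 10]))     # 2
print(find_existing_combination(conn, [1, 2, 10]))  # None
```

A non-None result means the combination already exists and the save should be blocked. None means a new productattributeid can be created for the set.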