Missing variables in HRRR data accessed via THREDDS Data Server

I accessed HRRR data through the THREDDS Data Server as shown here. However, the two variables I need, "Upward long-wave radiation" and "Downward short-wave radiation", are not contained in the dataset I get. When I access the data via AWS instead, these variables do exist. The variables available through THREDDS are the following (a sketch of the access code is shown after the list):
{'Best_4_layer_Lifted_Index_pressure_difference_layer',
'Categorical_freezing_rain_surface',
'Categorical_ice_pellets_surface',
'Categorical_rain_surface',
'Categorical_snow_surface',
'Composite_reflectivity_entire_atmosphere',
'Convective_available_potential_energy_pressure_difference_layer',
'Convective_available_potential_energy_surface',
'Convective_inhibition_pressure_difference_layer',
'Convective_inhibition_surface',
'Dewpoint_temperature_height_above_ground',
'Dewpoint_temperature_isobaric',
'Echo_top_cloud_tops',
'Geopotential_height_adiabatic_condensation_lifted',
'Geopotential_height_cloud_ceiling',
'Geopotential_height_cloud_tops',
'Geopotential_height_isobaric',
'Geopotential_height_surface',
'High_cloud_cover_high_cloud',
'Hourly_Maximum_of_Downward_Vertical_Velocity_in_the_lowest_400hPa_pressure_difference_layer_Mixed_intervals_Maximum',
'Hourly_Maximum_of_Simulated_Reflectivity_at_1_km_AGL_height_above_ground_Mixed_intervals_Maximum',
'Hourly_Maximum_of_Updraft_Helicity_over_Layer_2km_to_5_km_AGL_height_above_ground_layer_Mixed_intervals_Maximum',
'Hourly_Maximum_of_Upward_Vertical_Velocity_in_the_lowest_400hPa_pressure_difference_layer_Mixed_intervals_Maximum',
'Lightning_entire_atmosphere',
'Low_cloud_cover_low_cloud',
'Medium_cloud_cover_middle_cloud',
'Per_cent_frozen_precipitation_surface',
'Planetary_boundary_layer_height_surface',
'Precipitable_water_entire_atmosphere_single_layer',
'Pressure_of_level_from_which_parcel_was_lifted_pressure_difference_layer',
'Pressure_reduced_to_MSL_msl',
'Pressure_surface',
'Reflectivity_height_above_ground',
'Snow_depth_surface',
'Storm_relative_helicity_height_above_ground_layer',
'Surface_lifted_index_isobaric_layer',
'Temperature_height_above_ground',
'Temperature_isobaric',
'Total_cloud_cover_entire_atmosphere',
'Total_column_integrated_graupel_entire_atmosphere_single_layer_Mixed_intervals_Maximum',
'Total_precipitation_surface_1_Hour_Accumulation',
'Vertical_u-component_shear_height_above_ground_layer',
'Vertical_v-component_shear_height_above_ground_layer',
'Vertical_velocity_geometric_sigma_layer_Mixed_intervals_Average',
'Vertically_integrated_liquid_water_VIL_entire_atmosphere',
'Visibility_surface',
'Water_equivalent_of_accumulated_snow_depth_surface_1_Hour_Accumulation',
'Wind_speed_gust_surface',
'Wind_speed_height_above_ground_Mixed_intervals_Maximum',
'u-component_of_wind_height_above_ground',
'u-component_of_wind_isobaric',
'u-component_storm_motion_height_above_ground_layer',
'v-component_of_wind_height_above_ground',
'v-component_of_wind_isobaric',
'v-component_storm_motion_height_above_ground_layer'}
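
For reference, here is a minimal sketch of the kind of THREDDS access being described, using Siphon's NetCDF Subset Service client to list the available variables. The catalog URL and the choice of the latest CONUS 2.5 km dataset are my assumptions, not taken from the original post:

from siphon.catalog import TDSCatalog

# Open the HRRR catalog on Unidata's THREDDS server (assumed URL) and take the
# latest available dataset.
cat = TDSCatalog(
    "https://thredds.ucar.edu/thredds/catalog/grib/NCEP/HRRR/CONUS_2p5km/latest.xml"
)
ds = list(cat.datasets.values())[0]

# The NetCDF Subset Service client exposes the variable names as a set,
# which is the kind of listing shown above.
ncss = ds.subset()
print(sorted(ncss.variables))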

Spark streaming writing as delta and checkpoint location

I am streaming from a Delta table as a source and writing to another Delta table after performing some transformations, and this all worked. I recently looked at some videos and posts about best practices and found that I needed to make one addition and one modification:
The addition was adding a queryName.
The modification was changing the checkpoint location so that it resides alongside the data rather than in a separate directory, as I had been doing.
So I have one question and a problem.
The question is: can I add the queryName now, after my stream has been running for some time, without any consequences?
The problem is: now that I have set my checkpoint location to the same directory where my Delta table lives, I can no longer create an external Hive table. It fails with
pyspark.sql.utils.AnalysisException: Cannot create table ('`spark_catalog`.`schemaname`.`tablename`'). The associated location ('abfss://refined#datalake.dfs.core.windows.net/curated/schemaname/tablename') is not empty but it's not a Delta table
So, this was my original code, which worked
def upsert(microbatchdf, batchId):
    # ...some transformations on microbatchdf...
    # ..........................
    # ..........................

    # Create Delta table beforehand as otherwise generated columns can't be created
    # after having written the data into the data lake with the usual partitionBy
    deltaTable = (
        DeltaTable.createIfNotExists(spark)
        .tableName(f"{target_schema_name}.{target_table_name}")
        .addColumns(microbatchdf_deduplicated.schema)
        .addColumn(
            "trade_date_year",
            "INT",
            generatedAlwaysAs="Year(trade_date)",
        )
        .addColumn(
            "trade_date_month",
            "INT",
            generatedAlwaysAs="MONTH(trade_date)",
        )
        .addColumn("trade_date_day", "INT", generatedAlwaysAs="DAY(trade_date)")
        .partitionedBy("trade_date_year", "trade_date_month", "trade_date_day")
        .location(
            f"abfss://{target_table_location_filesystem}#{datalakename}.dfs.core.windows.net/{target_table_location_directory}"
        )
        .execute()
    )
    # ...some transformations and writing to the delta table...
# end
# This is how the stream is run
streamjob = (
    spark.readStream.format("delta")
    .table(f"{source_schema_name}.{source_table_name}")
    .writeStream.format("delta")
    .outputMode("append")
    .foreachBatch(upsert)
    .trigger(availableNow=True)
    .option(
        "checkpointLocation",
        f"abfss://{target_table_location_filesystem}#{datalakename}.dfs.core.windows.net/curated/checkpoints/",
    )
    .start()
)
streamjob.awaitTermination()
Now, to this working code, I only tried adding the queryName and modifying the checkpoint location (see the comments for the addition and the modification):
streamjob = (
    spark.readStream.format("delta")
    .table(f"{source_schema_name}.{source_table_name}")
    .writeStream.format("delta")
    .queryName(f"{source_schema_name}.{source_table_name}")  # this was added
    .outputMode("append")
    .foreachBatch(upsert)
    .trigger(availableNow=True)
    .option(
        "checkpointLocation",
        f"abfss://{target_table_location_filesystem}#{datalakename}.dfs.core.windows.net/{target_table_location_directory}/_checkpoint",  # this was changed
    )
    .start()
)
streamjob.awaitTermination()
In my data lake the _checkpoint folder did get created, and apparently it is this folder that makes the external table creation complain about a non-empty location, whereas the documentation here mentions that ...
So why does the external Hive table creation fail? Also, please note my question about adding the queryName to an already running stream.
One point to note: I have tried dropping the external table and also removing the contents of that directory, so there is nothing in that directory except the _checkpoint folder, which got created when I ran the streaming job, just before it got to creating the table inside the upsert method.
Any questions and I can help clarify.
The problem is that the checkpoint files are written before you call the `DeltaTable.createIfNotExists` function, which checks whether there is any data in that location; it fails because additional files are already there, and they don't belong to the Delta Lake table.
If you want to keep the checkpoint with your data, you need to move `DeltaTable.createIfNotExists(spark)...` outside of the upsert function - in this case the table will be created before any checkpoint files are written.
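
A minimal sketch of that restructuring, keeping the variable names from the question. Taking the target schema from the source table is my assumption here; if the transformations inside upsert change the schema, the column list would need to be built accordingly:

from delta.tables import DeltaTable

# Create the target Delta table once, before the stream starts, so that the
# location is already a valid Delta table when the first checkpoint files land
# next to it.
target_schema = spark.table(f"{source_schema_name}.{source_table_name}").schema  # assumption

(
    DeltaTable.createIfNotExists(spark)
    .tableName(f"{target_schema_name}.{target_table_name}")
    .addColumns(target_schema)
    .addColumn("trade_date_year", "INT", generatedAlwaysAs="Year(trade_date)")
    .addColumn("trade_date_month", "INT", generatedAlwaysAs="MONTH(trade_date)")
    .addColumn("trade_date_day", "INT", generatedAlwaysAs="DAY(trade_date)")
    .partitionedBy("trade_date_year", "trade_date_month", "trade_date_day")
    .location(
        f"abfss://{target_table_location_filesystem}#{datalakename}.dfs.core.windows.net/{target_table_location_directory}"
    )
    .execute()
)

def upsert(microbatchdf, batchId):
    # ...only the transformations and the write into the already-existing table remain here...
    ...

# The writeStream definition itself stays as in the question.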

`spark.sql.warehouse.dir` is ignored if `enableHiveSupport()`

I encountered a weird situation where a location - hdfs://<host>/hive/warehouse - is used to hold the data for Spark managed tables.
It is a path out of nowhere: I have spark.sql.warehouse.dir in spark-defaults.conf set to hdfs://<host>/usr/<usr>/spark/warehouse, a totally different location, and Hive's default warehouse location (metastore.warehouse.dir) is /user/hive/warehouse.
I am using a standalone Hive Metastore Service instead of a full-fledged Hive instance.
This situation happens only with enableHiveSupport().
If I remove the .enableHiveSupport() part from the SparkSession initialization code, saveAsTable() does the expected thing: the data is put inside the path set by spark.sql.warehouse.dir.
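
For reference, a minimal sketch of the kind of session initialization being described; the app name, the table name, and setting the property in code rather than in spark-defaults.conf are my own choices for illustration:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("warehouse-dir-test")  # illustrative name
    .config("spark.sql.warehouse.dir", "hdfs://<host>/usr/<usr>/spark/warehouse")
    .enableHiveSupport()  # with this line, the managed table ends up under the Hive warehouse path instead
    .getOrCreate()
)

# Write a managed table and check where its files actually land.
spark.range(10).write.saveAsTable("warehouse_dir_test_table")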

Can I save a trace file/extended events file to a partition other than the C drive on the server, or to another server altogether?

I've recently set some traces and extended events up and running in SQL on our new virtual server to show the access that users have to each database and whether they have logged in recently, and I have set the output to save as a physical file on the server rather than writing to a SQL table, to save resources. I've set the traces up as jobs running at 8am each morning with a 12-hour delay so we can record as much information as possible.
Our IT department ideally doesn't want anything other than the OS on the C drive of the virtual server, so I'd like to be able to write the trace from my SQL script either to a different partition or to another server altogether.
I have attempted to use a direct path to a different server within my code and have also tried a partition other than C; however, unless I write the trace/extended event files to the C drive I get an error message.
CREATE EVENT SESSION [LoginTraceTest] ON SERVER
ADD EVENT sqlserver.existing_connection(
    SET collect_database_name=(1),collect_options_text=(1)
    ACTION(package0.event_sequence,sqlos.task_time,sqlserver.client_pid,
        sqlserver.database_id,sqlserver.database_name,sqlserver.is_system,
        sqlserver.nt_username,sqlserver.request_id,sqlserver.server_principal_sid,
        sqlserver.session_id,sqlserver.session_nt_username,sqlserver.sql_text,
        sqlserver.username)),
ADD EVENT sqlserver.login(
    SET collect_database_name=(1),collect_options_text=(1)
    ACTION(package0.event_sequence,sqlos.task_time,sqlserver.client_pid,
        sqlserver.database_id,sqlserver.database_name,sqlserver.is_system,
        sqlserver.nt_username,sqlserver.request_id,sqlserver.server_principal_sid,
        sqlserver.session_id,sqlserver.session_nt_username,sqlserver.sql_text,
        sqlserver.username))
ADD TARGET package0.asynchronous_file_target (
    SET FILENAME = N'\\SERVER1\testFolder\LoginTrace.xel',
    METADATAFILE = N'\\SERVER1\testFolder\LoginTrace.xem' );
The error I receive is this:
Msg 25641, Level 16, State 0, Line 6
For target, "package0.asynchronous_file_target", the parameter "filename" passed is invalid. Target parameter at index 0 is invalid
If I change it to another partition rather than a different server:
SET FILENAME = N'D:\Traces\LoginTrace\LoginTrace.xel',
METADATAFILE = N'D:\Traces\LoginTrace\LoginTrace.xem' );
SQL Server states that the command completed successfully, but the file isn't written to that partition.
Any ideas as to what I can do to write the files to another partition or to another server?

Importing data from multi-value D3 database into SQL issues

I'm trying to use mv.NET by Bluefinity Tools. I made some integration packages with it for importing data from a D3 multi-value database into MS SQL 2012, but I seem to be having some trouble with the mapping.
For the VOYAGES table I have some commentX fields in the D3 application that are acting quite unwieldy, and the INSERT fails after a certain number of rows with the following message:
>Error: 0xC0047062 at INSERT, mvNET Source[354]: System.Exception: Error #8: dataReader[0] = LTPAC002 ci.BufferColumnIndex = 52, ci.ColumnName = COMMGROUP(Error #8: dataReader[0] = LTPAC002 ci.BufferColumnIndex = 52, ci.ColumnName = COMMGROUP(The value is too large to fit in the column data area of the buffer.))
at mvNETDataSource.mvNETSource.PrimeOutput(Int32 outputs, Int32[] outputIDs, PipelineBuffer[] buffers)
at Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostPrimeOutput(IDTSManagedComponentWrapper100 wrapper, Int32 outputs, Int32[] outputIDs, IDTSBuffer100[] buffers, IntPtr ppBufferWirePacket)
Error: 0xC0047038 at INSERT, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on mvNET Source returned error code 0x80131500. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
"The value is too large to fit in the column data area of the buffer." - I tried changing the input/output types but can't seem to get it right.
In the SQL table the columns are of type ntext.
In the .dtsx job the data type for the columns is Unicode String [DT_WSTR] with length 4000; I guess these are auto-detected.
The import worked for other D3 files like this, so I'm not sure why it fails for these comment fields.
Running the query in the mv.NET Data Manager (on the D3 server) times out after 240 seconds, so maybe this is the underlying issue?
Any ideas how to proceed? Thank you ~
The most likely reason is that the COMMGROUP column does not have the correct data type, or that some records in the source do not fit into the output type.
To find the record causing the error, set the error output of the failing component to redirect rows and write the result set to a txt/csv/tsv file.
Then check the data.
The exception is being thrown from mv.NET, so I suggest you call (or ask your reseller to call) Bluefinity support and ask them about this. You're paying for support; you might as well use it. Those programs shouldn't be allowed to throw exceptions like that.
D3 doesn't export Unicode, so that might be one issue. But if the Data Manager times out, then I suspect something is wrong with the connectivity into D3. Open a Connection Monitor from the Session Monitor and watch the connection when you make the request. I'm guessing it's either hanging or, more probably, falling into BASIC Debug.
Make sure all D3-side programs related to this are either all Flash-compiled or all not Flashed. Your app code will fall into Debug if it's not Flashed but MVNET.BP is.
If it's your program that's in Debug, fix it. If you're not sure which program it is, run LIST-RUNTIME-ERRORS in DM.
If it's an MVNET.BP program, again work with Bluefinity. If you are using MVSP for connectivity then the Connection Monitor may be useless; you'll need to change that to an IP (Telnet) connection to see the raw data exchange.

X-Cart - SQL error notification (Error code : 1030)

I am working on an X-Cart website for my company. Right now I keep getting error messages from my website http://mothersenvogue.com.kh/ like the one below:
[24-May-2015 08:50:51] (shop: 24-May-2015 15:50:51) SQL error:
Site : https://mothersenvogue.com.kh
Remote IP : 176.9.29.209
Logged as :
SQL query : SHOW FIELDS FROM xcart_session_history
Error code : 1030
Description : Got error 28 from storage engine
Request URI: /secure_login.php?xid=025530538a738ddc86617a9aa81bc990
Backtrace:
/home/www/mothersenvogue.com.kh/include/func/func.db.php:189
/home/www/mothersenvogue.com.kh/include/func/func.db.php:115
/home/www/mothersenvogue.com.kh/include/func/func.db.php:384
/home/www/mothersenvogue.com.kh/include/func/func.db.php:630
/home/www/mothersenvogue.com.kh/include/func/func.db.php:458
/home/www/mothersenvogue.com.kh/include/sessions.php:161
/home/www/mothersenvogue.com.kh/init.php:524
/home/www/mothersenvogue.com.kh/preauth.php:51
/home/www/mothersenvogue.com.kh/auth.php:45
/home/www/mothersenvogue.com.kh/secure_login.php:37
-------------------------------------------------
Many error messages come from func.db.php, init.php, preauth.php, and auth.php, all at the same line numbers and on the same SQL "SHOW" query.
I tried to check all the above files at the given line numbers but I could not find anything wrong.
Please kindly advise me what is wrong. Is there something wrong inside these files? I get many error messages sent to me by email with content similar to the above.
I was referred here from my previous question on the X-Cart forum, and here is my question there:
https://bt.x-cart.com/view.php?id=44717
Many thanks.
In most cases the file storage (the drive where your files are located on the hosting server, and where you checked the free space) is physically located on a different virtual/physical server or drive. Most hosting companies use optimized dedicated servers for MySQL.
Thus you see enough space in your account, but MySQL still reports that there is no space left on the drive where the MySQL server is actually running ("Got error 28 from storage engine" is the classic "no space left on device" condition).
So the best approach is to contact the hosting provider and find out the disk space situation on the very machine where MySQL is running.