Syntax for wildcard SQL Server Merge statement - sql

This SQL Server MERGE statement works, but it is clumsy. Is there any syntax to merge these two tables, which have the exact same structure, without listing every column? I am trying to update the Score_Data database from the Score_import database. I have many tables to do and do not want to type them all out. Thanks.
MERGE INTO [Score_Data].[dbo].Product as Dp
USING [Score_import].[dbo].Product as Ip
ON Dp.part_no = Ip.part_no
WHEN MATCHED THEN
UPDATE
SET Dp.total = Ip.total
,Dp.description = Ip.description
,Dp.family = Ip.family
,DP.um = IP.um
,DP.new_part_no = IP.new_part_no
,DP.prod_code = IP.prod_code
,DP.sub1 = IP.sub1
,DP.sub2 = IP.sub2
,DP.ven_no = IP.ven_no
,DP.no_sell = IP.no_sell
,DP.rp_dns = IP.rp_dns
,DP.nfa = IP.nfa
,DP.loc = IP.loc
,DP.cat_desc = IP.cat_desc
,DP.cat_color = IP.cat_color
,DP.cat_size = IP.cat_size
,DP.cat_fits = IP.cat_fits
,DP.cat_brand = IP.cat_brand
,DP.cat_usd1 = IP.cat_usd1
,DP.cat_usd2 = IP.cat_usd2
,DP.cat_usd3 = IP.cat_usd3
,DP.cat_usd4 = IP.cat_usd4
,DP.cat_usd5 = IP.cat_usd5
,DP.cat_usd6 = IP.cat_usd6
,DP.cat_usd7 = IP.cat_usd7
,DP.cat_usd8 = IP.cat_usd8
,DP.cat_usd9 = IP.cat_usd9
,DP.cat_usd10 = IP.cat_usd10
,DP.cat_usd11 = IP.cat_usd11
,DP.cat_usd12 = IP.cat_usd12
,DP.cat_usd13 = IP.cat_usd13
,DP.cat_usd14 = IP.cat_usd14
,DP.cat_usd15 = IP.cat_usd15
,DP.buy = IP.buy
,DP.price_1 = IP.price_1
,DP.price_2 = IP.price_2
,DP.price_3 = IP.price_3
,DP.price_4 = IP.price_4
,DP.price_5 = IP.price_5
,DP.price_6 = IP.price_6
,DP.price_7 = IP.price_7
,DP.price_8 = IP.price_8
,DP.price_9 = IP.price_9
,DP.create_date = IP.create_date
,DP.barcode = IP.barcode
,DP.check_digit = IP.check_digit
,DP.supplier = IP.supplier
,DP.prc_fam_code = IP.prc_fam_code
,DP.note = IP.note
,DP.mfg_part_no = IP.mfg_part_no
,DP.special = IP.special
,DP.spc_price = IP.spc_price
,DP.firm = IP.firm
,DP.box = IP.box
,DP.no_split = IP.no_split
,DP.drop_ship = IP.drop_ship
,DP.case_pack = IP.case_pack
,DP.inner_pack = IP.inner_pack
WHEN NOT MATCHED BY TARGET THEN
INSERT (part_no
,description
,family
,Total
,um
,new_part_no
,prod_code
,sub1
,sub2
,ven_no
,no_sell
,rp_dns
,nfa
,loc
,cat_desc
,cat_color
,cat_size
,cat_fits
,cat_brand
,cat_usd1
,cat_usd2
,cat_usd3
,cat_usd4
,cat_usd5
,cat_usd6
,cat_usd7
,cat_usd8
,cat_usd9
,cat_usd10
,cat_usd11
,cat_usd12
,cat_usd13
,cat_usd14
,cat_usd15
,buy
,price_1
,price_2
,price_3
,price_4
,price_5
,price_6
,price_7
,price_8
,price_9
,create_date
,barcode
,check_digit
,supplier
,prc_fam_code
,note
,mfg_part_no
,special
,spc_price
,firm
,box
,no_split
,drop_ship
,case_pack
,inner_pack)
VALUES
(Ip.Part_no
,Ip.description
,Ip.family
,Ip.Total
,Ip.um
,Ip.new_part_no
,Ip.prod_code
,Ip.sub1
,Ip.sub2
,Ip.ven_no
,Ip.no_sell
,Ip.rp_dns
,Ip.nfa
,Ip.loc
,Ip.cat_desc
,Ip.cat_color
,Ip.cat_size
,Ip.cat_fits
,Ip.cat_brand
,Ip.cat_usd1
,Ip.cat_usd2
,Ip.cat_usd3
,Ip.cat_usd4
,Ip.cat_usd5
,Ip.cat_usd6
,Ip.cat_usd7
,Ip.cat_usd8
,Ip.cat_usd9
,Ip.cat_usd10
,Ip.cat_usd11
,Ip.cat_usd12
,Ip.cat_usd13
,Ip.cat_usd14
,Ip.cat_usd15
,Ip.buy
,Ip.price_1
,Ip.price_2
,Ip.price_3
,Ip.price_4
,Ip.price_5
,Ip.price_6
,Ip.price_7
,Ip.price_8
,Ip.price_9
,Ip.create_date
,Ip.barcode
,Ip.check_digit
,Ip.supplier
,Ip.prc_fam_code
,Ip.note
,Ip.mfg_part_no
,Ip.special
,Ip.spc_price
,Ip.firm
,Ip.box
,Ip.no_split
,Ip.drop_ship
,Ip.case_pack
,Ip.inner_pack)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT $action, Inserted.*, Deleted.*;
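As far as I know, T-SQL has no wildcard form for the SET or INSERT column lists in MERGE, so a common workaround is to generate the statement text from a list of column names. Below is a minimal sketch of that idea in Python. build_merge is a hypothetical helper, the three-column list is shortened for illustration, and in practice the names might be read per table from INFORMATION_SCHEMA.COLUMNS.

```python
def build_merge(target, source, key, columns):
    """Build a T-SQL MERGE statement that syncs `target` from `source`.

    `key` is the join column; `columns` are the remaining columns to copy.
    Names are bracket-quoted but not escaped, so pass trusted names only.
    """
    all_columns = [key, *columns]
    set_clause = ",\n    ".join(f"T.[{c}] = S.[{c}]" for c in columns)
    insert_cols = ", ".join(f"[{c}]" for c in all_columns)
    insert_vals = ", ".join(f"S.[{c}]" for c in all_columns)
    return "\n".join([
        f"MERGE INTO {target} AS T",
        f"USING {source} AS S",
        f"ON T.[{key}] = S.[{key}]",
        "WHEN MATCHED THEN",
        f"  UPDATE SET\n    {set_clause}",
        "WHEN NOT MATCHED BY TARGET THEN",
        f"  INSERT ({insert_cols})",
        f"  VALUES ({insert_vals})",
        "WHEN NOT MATCHED BY SOURCE THEN",
        "  DELETE",
        "OUTPUT $action, Inserted.*, Deleted.*;",
    ])


# Shortened column list for illustration; the real list would hold every column.
sql = build_merge(
    "[Score_Data].[dbo].Product",
    "[Score_import].[dbo].Product",
    "part_no",
    ["total", "description", "family"],
)
print(sql)
```

Generating the text this way keeps the UPDATE and INSERT lists in step with each other, which avoids copy-paste slips in long hand-typed lists.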

Related

Calculating means for Columns based on data in another data set

I have two data sets, let's call them A and B (dput of the first 5 rows of each below):
`A: structure(list(Location = c(3960.82823, 3923.691, 3919.40593,
3907.97909, 3886.55377), Height = c(0.163744751, 0.231555472,
0.232150996, 0.192475738, 0.162966924), Start = c(3963.68494,
3946.54468, 3920.83429, 3909.40745, 3895.1239), End = c(3953.68645,
3920.83429, 3909.40745, 3895.1239, 3883.69706)), row.names = c(NA,
5L), class = "data.frame")
`
`B:structure(list(Wavenumber..cm.1. = c(3997.96546, 3996.5371, 3995.10875,
3993.68039, 3992.25204), M100 = c(0.00106, 0.00105, 0.00095,
0.00075, 0.00053), M101 = c(0.00081, 0.00092, 0.00102, 0.001,
0.00082), M102 = c(0.00099, 0.00109, 0.00105, 9e-04, 0.00072),
M103 = c(0.00101, 0.00111, 0.0012, 0.00129, 0.00133), M104 = c(0.00081,
0.00083, 0.00084, 0.00086, 0.00089), M105 = c(0.00139, 0.00113,
0.00092, 0.00089, 0.00102), M106 = c(0.00095, 0.00103, 0.00095,
0.00074, 0.00058), M107 = c(0.00054, 0.00058, 0.00059, 0.00049,
0.00032), M108 = c(0.00042, 5e-04, 5e-04, 0.00034, 0.00011
), M109 = c(0.00069, 0.00051, 0.00043, 0.00051, 0.00065),
M110 = c(0.00113, 0.00121, 0.00124, 0.00116, 0.00099), M111 = c(0.00039,
0.00056, 0.00068, 0.00068, 0.00056), M112 = c(0.0011, 0.00112,
0.00112, 0.00108, 0.00099), M113 = c(3e-04, 3e-04, 3e-04,
0.00027, 0.00019), M114 = c(0.00029, 6e-05, -2e-05, 9e-05,
0.00028), M115 = c(0.00091, 0.00079, 0.00061, 0.00038, 2e-04
), M116 = c(0.00117, 0.00105, 0.00096, 0.00092, 0.00092),
M117 = c(0.00039, 2e-04, 6e-05, 6e-05, 0.00018), M118 = c(0.00096,
0.00073, 0.00055, 0.00047, 0.00049), M119 = c(0.00037, 0.00031,
0.00024, 0.00018, 0.00018), M120 = c(0.00116, 0.00098, 0.00084,
0.00076, 0.00067), M121 = c(0.00039, 0.00024, 0.00011, 7e-05,
0.00011), M122 = c(0.00032, 0.00038, 0.00045, 0.00044, 0.00035
), M123 = c(9e-04, 0.00097, 0.00108, 0.0012, 0.00128), M124 = c(-0.00082,
-0.00065, -0.00049, -0.00037, -0.00036), M125 = c(0.00053,
0.00054, 0.00055, 6e-04, 0.00071), M126 = c(7e-05, 0.00022,
0.00022, 0.00011, 2e-05), M127 = c(0.00086, 9e-04, 0.00086,
0.00073, 0.00058), M128 = c(0.00089, 0.00078, 0.00069, 0.00057,
0.00043), M129 = c(0.00094, 0.00097, 0.00106, 0.00114, 0.00105
), M130 = c(0.0013, 0.00118, 0.00115, 0.00116, 0.00111),
M131 = c(0.00029, 0.00033, 0.00033, 3e-04, 0.00022), M132 = c(0,
0.00026, 0.00048, 6e-04, 0.00063), M133 = c(3e-05, -6e-05,
-6e-05, 5e-05, 0.00019), M134 = c(0.00056, 0.00054, 0.00052,
0.00054, 0.00057), M135 = c(2e-05, -4e-05, 6e-05, 0.00031,
0.00057), M136 = c(0.00083, 0.00075, 0.00068, 0.00068, 0.00073
), M137 = c(0.00064, 0.00074, 0.00084, 0.00095, 0.00105),
M139 = c(0.00044, 0.00044, 0.00042, 0.00043, 0.00047), M140 = c(0.00138,
0.00113, 0.00102, 0.0011, 0.00121), M141 = c(0.00062, 0.00043,
2e-04, 2e-05, 0), M142 = c(-0.00022, -0.00017, -0.00014,
-1e-04, 0), M143 = c(0.00109, 0.00108, 0.00103, 0.00093,
0.00087), M144 = c(0.00104, 0.00116, 0.00117, 0.00105, 0.00085
), M145 = c(7e-04, 0.00096, 0.00109, 0.00098, 0.00069), M146 = c(0.0014,
0.00158, 0.00165, 0.00154, 0.0013), M147 = c(6e-04, 0.00071,
0.00075, 0.00072, 0.00065), M148 = c(0.00098, 0.00093, 0.00091,
9e-04, 0.00088), M149 = c(0.00055, 0.00058, 0.00054, 0.00037,
0.00017), M150 = c(7e-04, 0.00068, 8e-04, 0.00107, 0.00132
), M151 = c(0.00037, 0.00042, 0.00046, 0.00047, 0.00046),
M152 = c(0.00047, 0.00042, 0.00043, 0.00045, 0.00045), M153 = c(0.00095,
0.00088, 0.00083, 8e-04, 0.00072), M154 = c(6e-05, 0.00013,
0.00032, 0.00054, 0.00062), M155 = c(0.00061, 0.00057, 0.00043,
0.00022, 4e-05), M156 = c(0.00077, 0.00078, 0.00071, 0.00052,
0.00025), M157 = c(0.00088, 0.00078, 0.00069, 0.00063, 0.00058
), M158 = c(0.00091, 0.00085, 0.00082, 0.00081, 8e-04), M159 = c(0.00078,
0.00076, 0.00073, 0.00074, 0.00079), M160 = c(0.00068, 7e-04,
0.00075, 8e-04, 0.00079), M161 = c(0.00055, 0.00073, 0.00082,
0.00085, 9e-04), M162 = c(0.00104, 0.00111, 0.0011, 0.00104,
0.00102), M163 = c(0.00076, 0.00071, 0.00069, 0.00068, 0.00067
), M164 = c(0.0012, 0.00133, 0.00154, 0.00174, 0.00177),
M165 = c(0.00072, 0.00073, 0.00072, 0.00074, 0.00083), M166 = c(0.00067,
0.00055, 0.00035, 0.00012, -2e-05), M167 = c(0.00068, 0.00053,
0.00047, 0.00051, 0.00059), M168 = c(0.00067, 0.00092, 0.001,
0.00087, 0.00067), M169 = c(0.00124, 0.00107, 0.00101, 0.00108,
0.00118), M170 = c(0.00054, 0.00064, 0.00069, 0.00066, 0.00053
), M171 = c(0.00029, 3e-04, 3e-04, 0.00031, 3e-04), M172 = c(0.00085,
0.00091, 0.00082, 0.00063, 0.00052), M173 = c(0.00022, 0.00036,
0.00053, 0.00061, 0.00056), M174 = c(5e-04, 0.00031, 0.00021,
0.00023, 0.00031), M175 = c(0.00074, 0.00066, 0.00059, 0.00051,
0.00043), M176 = c(9e-04, 0.00062, 0.00044, 0.00039, 0.00039
), M177 = c(0.00045, 0.00038, 0.00033, 0.00035, 0.00043),
M178 = c(0.00075, 0.00092, 0.00097, 0.00086, 0.00067), M179 = c(0.00047,
0.00033, 0.00026, 3e-04, 0.00037), M180 = c(0.00083, 0.00077,
0.00074, 0.00074, 7e-04), M181 = c(0.0013, 0.00138, 0.00137,
0.00127, 0.00109), M182 = c(0.00062, 0.00049, 0.00043, 0.00042,
0.00038), M183 = c(0.00056, 4e-04, 0.00034, 0.00046, 0.00065
), M184 = c(0.00122, 0.00116, 0.00096, 0.00067, 0.00039),
M185 = c(0.00045, 0.00026, 0.00012, 1e-04, 0.00024), M187 = c(0.00078,
0.00038, 8e-05, 0, 0.00014)), row.names = c(NA, 5L), class = "data.frame")
`
I want to be able to calculate the means of the M columns in data set B, based on the Start and End columns in data set A (which correspond to the Wavenumber cm-1 column in data set B). So that for each Start and End set of values you have a corresponding mean for each M column in data set B.
So for example for the Start and End values in the first row of data set A:
Start: 3963.68494 End: 3953.68645 you would calculate the mean of each M column in data set B using the absorbance values corresponding to the Wavenumber cm-1 range of 3963.68494 to 3953.68645, which would then be stored in a separate data frame (with all the M column names) called meanData or something similar.
I can't quite figure out how to write a function/loop that would do that: take the Start and End values in dataset A, look up the Absorbance values in dataset B that fall into that Start and End range, calculate their mean, write it into a new data frame under its corresponding M column name, and repeat this for each row of Start and End values in dataset A. I know you would likely do it with an index, but I'm not sure how to write it exactly. Any help would be very much appreciated!
I tried creating different indexes for the Start and End columns and using them to try and specify the values I want in dataset B, using [] but I was unsuccessful:
`test<-mean(B$M100[which(B$Wavenumber..cm.1.[index2[i] to B$Wavenumber..cm.1.index3[i]])`
where index2 holds the Start values in dataset A and index3 holds the End values in dataset A. This did not work.
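The logic described above can be sketched as follows, in Python rather than R and purely as an illustration. range_means is a hypothetical helper and the small data set is made up, not taken from A or B. Because Start is larger than End in A, the sketch sorts each pair before filtering, and it treats both bounds as inclusive (an assumption).

```python
from statistics import mean


def range_means(ranges, wavenumbers, columns):
    """For each (start, end) pair, average every column over the rows whose
    wavenumber lies inside the range (bounds inclusive, given in either order)."""
    results = []
    for start, end in ranges:
        lo, hi = min(start, end), max(start, end)
        rows = [i for i, w in enumerate(wavenumbers) if lo <= w <= hi]
        results.append({
            # None marks a range that matched no rows
            name: mean(values[i] for i in rows) if rows else None
            for name, values in columns.items()
        })
    return results


# Made-up data standing in for B (wavenumber plus two M columns) and A (ranges).
wavenumbers = [3965.0, 3960.0, 3955.0, 3950.0]
columns = {"M100": [1.0, 2.0, 3.0, 4.0], "M101": [10.0, 20.0, 30.0, 40.0]}
ranges = [(3963.0, 3953.0), (3951.0, 3949.0)]

mean_data = range_means(ranges, wavenumbers, columns)
print(mean_data)
```

Each entry of mean_data corresponds to one row of A and holds one mean per M column, which is the shape of the meanData frame described in the question.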

PLOTLY tracegroupgap

Where do I insert tracegroupgap in this code? I have tried putting it inside legend=dict(...) without success.
layout = go.Layout(
    title = 'IGS',
    xaxis = dict(title = "Point", tickmode='linear', tick0=0, dtick=10),
    yaxis = dict(title = "sp, mv"),
    hovermode = 'closest',
    legend = dict(font=dict(family="Courier",
                            size=10,
                            color="black"),
                  tracegroupgap = 5
                  )
)
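For what it is worth, tracegroupgap is a key of the legend settings in Plotly's layout, so the legend dict looks like the right place for it. The gap is only drawn between legend groups, so it may show no effect unless the traces set legendgroup. A minimal sketch using plain dicts, which Plotly accepts in place of go.Layout and trace objects; the trace names, data, and group labels are made up for illustration.

```python
# Plain dicts stand in for go.Scatter and go.Layout here; the trace names,
# data points, and legend group labels are made up for illustration.
traces = [
    dict(type="scatter", name="run 1", x=[0, 10], y=[1, 2], legendgroup="A"),
    dict(type="scatter", name="run 2", x=[0, 10], y=[2, 3], legendgroup="B"),
]

layout = dict(
    title="IGS",
    xaxis=dict(title="Point", tickmode="linear", tick0=0, dtick=10),
    yaxis=dict(title="sp, mv"),
    hovermode="closest",
    legend=dict(
        font=dict(family="Courier", size=10, color="black"),
        tracegroupgap=5,  # vertical space in px between legend groups
    ),
)

figure = dict(data=traces, layout=layout)
```

With Plotly installed, go.Figure(figure) should build the figure from this dict.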

Spacy v3 - ValueError: [E030] Sentence boundaries unset

I'm training an entity linker model with spacy 3, and am getting the following error when running spacy train:
ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: nlp.add_pipe('sentencizer'). Alternatively, add the dependency parser or sentence recognizer, or set sentence boundaries by setting doc[i].is_sent_start. .
I've tried with both transformer and tok2vec pipelines, it seems to be failing on this line:
File "/usr/local/lib/python3.7/dist-packages/spacy/pipeline/entity_linker.py", line 252, in update
    sentences = [s for s in eg.reference.sents]
Running spacy debug data shows no errors.
I'm using the following config, before filling it in with spacy init fill-config:
[paths]
train = null
dev = null
kb = "./kb"
[system]
gpu_allocator = "pytorch"
[nlp]
lang = "en"
pipeline = ["transformer","parser","sentencizer","ner", "entity_linker"]
batch_size = 128
[components]
[components.transformer]
factory = "transformer"
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"
tokenizer_config = {"use_fast": true}
[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96
[components.sentencizer]
factory = "sentencizer"
punct_chars = null
[components.entity_linker]
factory = "entity_linker"
entity_vector_length = 64
get_candidates = {"@misc":"spacy.CandidateGenerator.v1"}
incl_context = true
incl_prior = true
labels_discard = []
[components.entity_linker.model]
@architectures = "spacy.EntityLinker.v1"
nO = null
[components.entity_linker.model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v1"
pretrained_vectors = null
width = 96
depth = 2
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true
[components.parser]
factory = "parser"
[components.parser.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "parser"
extra_state_tokens = false
hidden_width = 128
maxout_pieces = 3
use_upper = false
nO = null
[components.parser.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
[components.parser.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
[components.ner]
factory = "ner"
[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = false
nO = null
[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
[components.ner.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
[corpora]
[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
[training]
accumulate_gradient = 3
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
[training.optimizer]
@optimizers = "Adam.v1"
[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 5e-5
[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
size = 2000
buffer = 256
[initialize]
vectors = ${paths.vectors}
[initialize.components]
[initialize.components.sentencizer]
[initialize.components.entity_linker]
[initialize.components.entity_linker.kb_loader]
@misc = "spacy.KBFromFile.v1"
kb_path = ${paths.kb}
I can write a script to add the sentence boundaries to the docs manually, but I am wondering why the sentencizer component is not doing this for me. Is there something missing in the config?
You haven't put the sentencizer in annotating_components, so the updates it makes aren't visible to other components during training. Take a look at the relevant section in the docs.
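As a sketch of what that might look like in the config above: annotating_components is a list under [training], available in spaCy v3.1 and later as far as I know. Only the new key is shown here; the other [training] settings stay as they are.

```ini
[training]
annotating_components = ["sentencizer"]
```

The components named in this list set their annotations on the docs during training, so later components such as the entity_linker can see them.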

Appending tables generated from a loop

I am a new Python user and am trying to append tables that I have pulled from a PDF using Camelot, but I am having trouble getting them to join together.
Here is my code:
url = 'https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_AT_Tables.pdf'
tables = camelot.read_pdf(url,flavor='stream', edge_tol = 500, pages = '1-end')
i = 0
while i in range(0,tables.n):
    header = tables[i].df.index[tables[i].df.iloc[:,0]=='Metropolitan Statistical Area'].to_list()
    header = str(header)[1:-1]
    header = (int(header))
    tables[i].df = tables[i].df.rename(columns = tables[i].df.iloc[header])
    tables[i].df = tables[i].df.drop(columns = {'': 'Blank'})
    print(tables[i].df)
    #appended_data.append(tables[i].df)
    #if i > 0:
    #    dfs = tables[i-1].append(tables[i], ignore_index = True)
    #pass
    i = i + 1
Any help would be much appreciated.
You can use pandas.concat() to concatenate a list of DataFrames.
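import pandas as pd

i = 0
while i in range(0,tables.n):
    # find the row that holds the real column headers
    header = tables[i].df.index[tables[i].df.iloc[:,0]=='Metropolitan Statistical Area'].to_list()
    header = str(header)[1:-1]
    header = (int(header))
    tables[i].df = tables[i].df.rename(columns = tables[i].df.iloc[header])
    tables[i].df = tables[i].df.drop(columns = {'': 'Blank'})
    i = i + 1  # advance to the next table; without this the loop never ends
# stack the cleaned tables into a single DataFrame
df_ = pd.concat([table.df for table in tables])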

Generating a value in a step function based on a variable

I'm creating an optimization model using Gurobi and am having trouble with one of my constraints. The constraint is used to establish the quantity and is based on supply and demand curves. The supply curve causes the problem, as it is a step curve. As seen in the code, the problem arises when I'm writing the def MC section.
Demand_Curve1_const = 250
Demand_Curve1_slope = -0.025
MC_water = 0
MC_gas = 80
MC_coal = 100
CAP_water = 5000
CAP_gas = 2500
CAP_coal = 2000
model = pyo.ConcreteModel()
model.Const_P1 = pyo.Param(initialize = Demand_Curve1_const)
model.slope_P1 = pyo.Param(initialize = Demand_Curve1_slope)
model.MCW = pyo.Param(initialize = MC_water)
model.MCG = pyo.Param(initialize = MC_gas)
model.MCC = pyo.Param(initialize = MC_coal)
model.CW = pyo.Param(initialize = CAP_water)
model.CG = pyo.Param(initialize = CAP_gas)
model.CC = pyo.Param(initialize = CAP_coal)
model.qw = pyo.Var(within = pyo.NonNegativeReals)
model.qg = pyo.Var(within = pyo.NonNegativeReals)
model.qc = pyo.Var(within = pyo.NonNegativeReals)
model.d = pyo.Var(within = pyo.NonNegativeReals)
def MC():
    if model.d <=5000:
        return model.MCW
    if model.d >= 5000 and model.d <= 7500:
        return model.MCG
    if model.d >= 7500 :
        return model.MCC
def Objective(model):
    return(model.Const_P1*model.d + model.slope_P1*model.d*model.d - (model.MCW*model.qw + model.MCG*model.qg + model.MCC*model.qc))
model.OBJ = pyo.Objective(rule = Objective, sense = pyo.maximize)
def P1inflow(model):
    return(MC == model.Const_P1+model.slope_P1*model.d*2)
model.C1 = pyo.Constraint(rule = P1inflow)
Your function MC as stated would make the model nonlinear, and in a rather nasty way (discontinuous).
Piecewise linear functions are often modeled through binary variables or SOS2 sets (Special Ordered Sets of type 2). As you are using Pyomo, you can also use a tool that can generate MIP formulations automatically for you. See help(Piecewise).
An example that fits your description is here.
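To illustrate the discontinuity, here is the same step curve written as plain Python on ordinary numbers. marginal_cost and the two lists are hypothetical names; the figures come from the question. This only works because demand is a plain number here. On a Pyomo Var the value is not known while the model is being built, so an if test like the one in MC cannot pick a branch.

```python
from bisect import bisect_left

# Cumulative capacity breakpoints and marginal costs, taken from the question:
# water covers demand up to 5000, gas up to 7500, coal beyond that.
BREAKPOINTS = [5000, 7500]
COSTS = [0, 80, 100]  # MC_water, MC_gas, MC_coal


def marginal_cost(demand):
    """Return the marginal cost of the unit that serves the given demand."""
    return COSTS[bisect_left(BREAKPOINTS, demand)]


# The cost jumps as soon as demand crosses a breakpoint.
for d in (4999, 5000, 5001, 7500, 7501):
    print(d, marginal_cost(d))
```

The jumps at 5000 and 7500 are what a solver cannot handle directly, which is why the binary-variable, SOS2, or Piecewise formulations mentioned above are used instead.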