采集项目数据处理详细步骤

Outshine 3月前 ⋅ 227 阅读

第一步:去除新表重复记录

DELETE FROM 表名 WHERE memberid IN (SELECT * FROM (SELECT memberid FROM 表名 GROUP BY memberid HAVING COUNT(memberid) > 1) t1) AND id NOT IN (SELECT * FROM (SELECT MIN(id) FROM 表名 GROUP BY memberid HAVING COUNT(memberid) > 1) t2)

第二步:备注栏标注 update 表名 set remark='new'

第三步:把新数据按网址类型插入新建的表: insert into alibaba_new_one(company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark) SELECT company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark FROM alibaba_new where company_url not LIKE 'https://www.1688.com%' ORDER BY id DESC

第四步:将原数据按ID递增方式插入新建的表 插入表一 insert into alibaba_new_one(company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark) SELECT company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark FROM alibaba_update ORDER BY id ASC 插入表二 insert into alibaba_new_two(company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark) SELECT company_name,company_url,memberid,is_cxt,contactor,phone,mobile,fax,address,product,business_scope,business_model,legal_person,reg_no,reg_date,company_type,update_on,remark FROM alibaba_two ORDER BY id ASC

多表去重复

delete from alibaba_tmp where memberid in (select memberid from alibaba) OR memberid in (select memberid from alibaba_two) OR memberid in (select memberid from alibaba_nophone)


全部评论: 0

    我有话说: