Deduplicate one file:
- export an EndNote database into a file in RIS format
- upload this file in DedupEndNote
- save the results file with deduplicated records
- import this results file into a new EndNote database
Deduplicate a new file against an existing file / EndNote database: see /twofiles

Deduplication in EndNote misses many duplicate records. Building and maintaining a Journals List within EndNote can partly solve this problem, but there remain many cases where EndNote is too unforgiving when comparing records. Some bibliographic databases offer deduplication for their own databases (OVID: Medline and EMBASE), but this does not help PubMed, Cochrane or Web of Science users.

DedupEndNote deduplicates an EndNote RIS file and writes a new RIS file with the unique records, which can be imported into a new EndNote database. It is more forgiving than EndNote itself when comparing records, and tests have shown that it identifies many more duplicates (see below under "Test results").

The program has been tested on EndNote databases with records from:

CINAHL (EBSCOHost)
Cochrane Library (Trials)
EMBASE (OVID)
Medline (OVID)
PsycINFO (OVID)
PubMed
Scopus
Web of Science (very few tests with conference papers)

1. Deduplicate records in a RIS file

Each pair of records is compared in 5 different ways. The general rule is:

Comparison	Result	Action
1 ... 5	YES	go to next comparison if present, else mark the records as duplicates
1 ... 5	(insufficient data for comparison)	treated as YES
1 ... 5	NO	stop comparisons for this pair of records

The following comparisons are used (in this order, chosen for performance reasons):

Publication year: Are they at most 1 year apart?
- Insufficient data: Records without a publication year are compared to all records unless they have been identified as a duplicate.
- Special cases: Cochrane Reviews are compared for the same publication year.
Starting page or DOI: Are they the same?
- If the starting pages are different or one or both are absent, the DOIs are compared.
- Preprocessing: Article number is treated as a starting page if starting page itself is empty or contains "-".
- Preprocessing: Starting pages are compared only for number: "S123" and "123" are considered the same.
- Preprocessing: In DOIs 'http://dx.doi.org/', 'http://doi.org/', ... are left out.
- Insufficient data: If one or both DOIs are missing and one or both of the starting pages are missing, the answer is YES. This is important because of PubMed ahead of print publications.
- Special cases: For Cochrane Reviews DOIs are compared before starting pages.
Authors: Is the Jaro-Winkler similarity of the authors > 0.67?
- Preprocessing: The author "Anonymous," and all Author Groups are skipped.
- Preprocessing: First names are reduced to initials ("Moorthy, Ranjith K." to "Moorthy, RK").
- Preprocessing: All authors from each record are joined by "; ".
- Insufficient data: If one or both records have no authors, the answer is YES (except if one of the records is a reply (see below) and one of the records has no starting page or DOI).
Title: Is the Jaro-Winkler similarity of (one of) the normalized titles > 0.9?
- The fields Original publication (OP), Short Title (ST), Title (TI) and sometimes Book section (T3, see below) are treated as titles.
- Because the Jaro-Winkler similarity algorithm puts a heavy penalty on differences at the beginning of a string, the normalized titles are also reversed.
- Preprocessing: The titles are normalized (converted to lower case, text between "<...>" removed, all characters which are not letters or numbers are replaced by a space character, ...).
- Insufficient data: If one of the records is a reply (see below), the titles are not compared and the answer is YES (but the Jaro-Winkler similarity of the authors must then be > 0.75 and the comparison between the journals is stricter).
ISSN or Journal: Are they the same (ISSN) or similar (Journal)?
- The fields Journal / Book Title (T2), Alternate Journal (J2) and sometimes Book section (T3, see below) are treated as journals, ISBNs are treated as ISSNs. All ISSNs and journal titles (including abbreviations) in the records are used.
- If the ISSNs are different or one or both records have no ISSN, the journals are compared.
- Abbreviated and full journal titles are compared in a sensible way (see examples below).
- Preprocessing: ISSNs are normalized (dashes are removed, lowercased). For ISBN-10 the first 9 digits are used, for ISBN-13 the 9 digits starting at position 4.
- Preprocessing: Journal titles of the form "Zhonghua wai ke za zhi [Chinese journal of surgery]" or "Zhonghua wei chang wai ke za zhi = Chinese journal of gastrointestinal surgery" or "The Canadian Journal of Neurological Sciences / Le Journal Canadien Des Sciences Neurologiques" are split into 2 journal titles.
- Preprocessing: The journal titles are normalized (hyphens, dots and apostrophes are replaced with space, end part between round or square brackets is removed, initial article is removed, ...).
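The preprocessing rules for starting pages, DOIs, ISSNs/ISBNs and authors can be illustrated with a small Python sketch. This is not DedupEndNote's own code: the function names are hypothetical, lowercasing the DOI is an assumption, and the skipping of Author Groups is left out because the text does not say how they are recognized.

```python
import re


def starting_page_number(pages: str) -> str:
    """Return the first run of digits, so "S123" and "123" compare equal."""
    match = re.search(r"\d+", pages)
    return match.group(0) if match else ""


def normalize_doi(doi: str) -> str:
    """Drop a resolver prefix such as 'http://dx.doi.org/' (lowercasing is an assumption)."""
    return re.sub(r"^https?://(dx\.)?doi\.org/", "", doi.strip().lower())


def normalize_issn_or_isbn(value: str) -> str:
    """Remove dashes, lowercase, and reduce ISBNs to their 9 significant digits."""
    compact = re.sub(r"[-\s]", "", value).lower()
    if len(compact) == 13:      # ISBN-13: the 9 digits starting at position 4
        return compact[3:12]
    if len(compact) == 10:      # ISBN-10: the first 9 digits
        return compact[:9]
    return compact              # ISSN


def author_initials(author: str) -> str:
    """Reduce first names to initials: "Moorthy, Ranjith K." -> "Moorthy, RK"."""
    last, _, first = author.partition(",")
    initials = "".join(p[0].upper() for p in re.split(r"[\s.\-]+", first) if p)
    return f"{last.strip()}, {initials}" if initials else last.strip()


def joined_authors(authors: list) -> str:
    """Skip "Anonymous," and join the remaining authors with "; "."""
    kept = [author_initials(a) for a in authors if a.strip() != "Anonymous,"]
    return "; ".join(kept)
```

With these helpers the ISBN-10 "9024274214" and the ISBN-13 "9789024274215" reduce to the same 9 digits, which is why they can be compared as equal.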

T3 field: Especially EMBASE (OVID) uses this field for (1) Conference title (majority of cases), (2) an alternative journal title, and (3) original (non English) title. Case 1 (identified as containing a number or "Annual", "Conference", "Congress", "Meeting" or "Society") is skipped. All other T3 fields are treated as Journals and as titles.
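A minimal sketch of the conference-title test for the T3 field, in Python. The function name is hypothetical, and matching the keywords case-sensitively exactly as listed is an assumption.

```python
import re

# Keywords as listed in the text; case-sensitive matching is an assumption.
CONFERENCE_WORDS = ("Annual", "Conference", "Congress", "Meeting", "Society")


def is_conference_title(t3: str) -> bool:
    """True if the T3 value contains a number or one of the conference keywords."""
    has_number = re.search(r"\d", t3) is not None
    has_keyword = any(word in t3 for word in CONFERENCE_WORDS)
    return has_number or has_keyword
```

A T3 value for which this returns False would be treated both as a journal and as a title.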

Reply: a publication is considered a reply if the title (field TI) contains "reply", or contains "author(...)respon(...)", or is nothing but "response" (all case insensitive).
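The reply rule can be expressed as one regular expression. The sketch below is an illustration in Python, not the tool's own code; it assumes that "(...)" in "author(...)respon(...)" stands for any text.

```python
import re

# "reply" anywhere, or "author" followed later by "respon",
# or a title that is nothing but "response" (all case insensitive).
REPLY_PATTERN = re.compile(r"reply|author.*respon|^\s*response\s*$", re.IGNORECASE)


def is_reply(title: str) -> bool:
    """True if the TI field marks the publication as a reply."""
    return REPLY_PATTERN.search(title) is not None
```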

If two records get 5 YES answers, they are considered duplicates.
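The general rule (stop at the first NO, treat insufficient data as YES) can be sketched as a short-circuiting chain. The Python below is an illustration only: the record shape (a dict keyed by RIS tag) and the two toy comparisons are hypothetical stand-ins for the five real comparisons.

```python
from typing import Callable, Optional, Sequence

# Hypothetical record shape: a dict keyed by RIS tag, e.g. {"PY": 2016, "DO": "10.1/x"}.
Record = dict
# A comparison answers True (YES), False (NO) or None (insufficient data).
Comparison = Callable[[Record, Record], Optional[bool]]


def year_close(a: Record, b: Record) -> Optional[bool]:
    """Toy version of comparison 1: publication years at most 1 year apart."""
    if a.get("PY") is None or b.get("PY") is None:
        return None
    return abs(a["PY"] - b["PY"]) <= 1


def same_doi(a: Record, b: Record) -> Optional[bool]:
    """Toy stand-in for comparison 2, looking at the DOI only."""
    if not a.get("DO") or not b.get("DO"):
        return None
    return a["DO"].lower() == b["DO"].lower()


def is_duplicate(a: Record, b: Record, comparisons: Sequence[Comparison]) -> bool:
    """Run the comparisons in order; only an explicit NO ends the chain."""
    for compare in comparisons:
        if compare(a, b) is False:
            return False
    return True
```

Putting the cheapest comparison first means most pairs are rejected before the more expensive string similarities are computed, which fits the remark that the order was chosen for performance reasons.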

2. Enrich the deduplicated records

Only the first record of a set of duplicate records is copied to the output file.

When writing the output file, the following fields will be changed:

Author (AU):
- if the (only) author is "Anonymous", the author is omitted
DOI (DO):
- the DOIs of the removed duplicate records are copied to the saved record and deduplicated. The DOI field is important for finding the full text in EndNote.
- DOIs of the form "10.1038/ctg.2014.12", "http://dx.doi.org/10.1038/ctg.2014.12", ... are rewritten in the prescribed form "https://doi.org/10.1038/ctg.2014.12". DOIs of this form are clickable links in EndNote.
Publication year (PY):
- if the saved record has no value for its Publication year but one of the removed duplicate records has, the first not empty Publication year of the duplicates is copied to the saved record.
Starting page (SP) and Article Number (C7):
- the article number is put in the Pages field (SP) if the Pages field is empty or does not contain a "-", overwriting the Pages field content.
- the article number field (C7) is omitted
- if the saved record has no value for its Pages field (e.g. PubMed ahead of print publications) but one of the removed duplicate records has, the first not empty pages of the duplicates are copied to the saved record.
- the Pages field gets an unabbreviated form: e.g. "482-91" is rewritten as "482-491".
- if the ending page is the same as the starting page, only the starting page is written ("192" instead of "192-192").
- for Cochrane Reviews a missing review number ("CD...") is extracted from the DOI.
Title (TI):
- If the publication is a reply, the title is replaced with the longest title from the duplicates (e.g. "Reply from the authors" is replaced by "Coagulation parameters and portal vein thrombosis in cirrhosis Reply")
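Two of the enrichment rules, the unabbreviated Pages field and the prescribed DOI form, are simple string rewrites. The following Python sketch shows one way to do them; the function names are hypothetical and this is not the tool's implementation.

```python
import re


def expand_pages(pages: str) -> str:
    """Rewrite "482-91" as "482-491" and "192-192" as "192"."""
    start, sep, end = pages.strip().partition("-")
    if not sep or not end:
        return start
    if len(end) < len(start):
        # Borrow the missing leading characters from the starting page.
        end = start[: len(start) - len(end)] + end
    return start if end == start else f"{start}-{end}"


def doi_url(doi: str) -> str:
    """Rewrite a bare or prefixed DOI in the form "https://doi.org/...". """
    bare = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi.strip(), flags=re.IGNORECASE)
    return "https://doi.org/" + bare


def merged_dois(dois: list) -> list:
    """Collect the DOIs of a duplicate set, rewritten and deduplicated in order."""
    return list(dict.fromkeys(doi_url(d) for d in dois if d.strip()))
```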

The output file is a new RIS file which can be imported into a new EndNote database.

DedupEndNote is slower than EndNote in deduplicating records because its comparisons are more time-consuming. EndNote can deduplicate an EndNote database of ca. 15,000 records in less than 5 seconds. DedupEndNote needs around 20 seconds to deduplicate the export file in RIS format (115MB).

DedupEndNote has borrowed several ideas from: Yu Jiang, Can Lin, Weiyi Meng, Clement Yu, Aaron M. Cohen and Neil R. Smalheiser: Rule-based deduplication of article records from bibliographic databases, in: Database 2014, ID bat086, doi:10.1093/database/bat086

Mark mode

If you want to manually merge the records which are duplicates according to DedupEndNote, you can use Mark mode.

In Mark mode the ID of the first record of a set of duplicate records is copied to the label field ("LB") of all duplicate records. The input file is copied to the output file with the addition of the Label field if it is not empty. The original content of the Label field is overwritten! All other fields are copied as is (so no enriching of the output: prescribed form of DOI, ...).

After importing the results file into a new EndNote database, make the Label field visible. The IDs in the Label field refer to the IDs of the original EndNote database.
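The labelling step of Mark mode amounts to mapping every record in a duplicate set to the ID of the first record of that set. A minimal Python sketch, assuming the duplicate sets are already known and are given as lists of record IDs (a hypothetical input shape):

```python
def mark_labels(duplicate_sets: list) -> dict:
    """Map each record ID to the ID of the first record in its duplicate set."""
    labels = {}
    for ids in duplicate_sets:
        first_id = ids[0]
        for record_id in ids:
            labels[record_id] = first_id   # value that would go into the LB field
    return labels
```

Records that do not appear in any set get no entry, which corresponds to an empty Label field in the output.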

To see the duplicates, search for "Label" "is greater than" 0, and sort on the label field.
To merge manually, select only the Label field under Edit > Preferences > Duplicates, then deduplicate (in EndNote) and merge at will.

Comparing the results of deduplication by EndNote itself and DedupEndNote:

- Import the result file into a new EndNote database.
- Deduplicate by EndNote itself.
- Select all records in the "Duplicate References" set, and mark them as Read.
- Select the "All References" set, limit to the duplicates found by DedupEndNote (search for "Label" "is greater than" 0), and sort on the label field. The records marked as Unread were identified as duplicates by DedupEndNote, but not by EndNote itself.
- Select the "All References" set, limit to the duplicates not found by DedupEndNote (search for "Label" "is less than" 0). The records marked as Read were identified as duplicates by EndNote itself, but not by DedupEndNote.

Test results

In the following table the results of EndNote's Find Duplicates are compared to the comparisons in DedupEndNote. For these tests only one field was selected in EndNote in "Edit > Preferences > Duplicates", and "Ignore spacing and punctuation" was selected.

Field	Examples (first record || second record)	EndNote finds duplicates	DedupEndNote Score
Starting page and article number	... || ...	???	???
Title	90Y radioembolization using resin microspheres in patients with hepatocellular carcinoma and portal vein thrombosis || 90Y RADIOEMBOLIZATION USING RESIN MICROSPHERES IN PATIENTS WITH HEPATOCELLULAR CARCINOMA AND PORTAL VEIN THROMBOSIS	Yes	1.00 == Yes
Title	Comments about Glisson's capsule phleboliths and portal vein thrombosis [1] || COMMENTS ABOUT GLISSON CAPSULE PHLEBOLITHS AND PORTAL-VEIN THROMBOSIS	No	0.92 == Yes
Title	Transarterial chemoembolization and <sup>90</sup>y radioembolization for hepatocellular carcinoma: Review of current applications beyond intermediate-stage disease || Transarterial Chemoembolization and Y-90 Radioembolization for Hepatocellular Carcinoma: Review of Current Applications Beyond Intermediate-Stage Disease	No	0.92 == Yes
Title	Epidemiology and diagnosis profile of digestive cancer in teaching hospital campus of lome: About 250 cases. [French] || Epidemiology and diagnosis profile of digestive cancer in teaching Hospital Campus of Lome: about 250 cases	No	0.99 == Yes
Title	Post Splenectomy Outcome in beta-Thalassemia || Post Splenectomy Outcome in β-Thalassemia	No	0.96 == Yes
Title	Letter: portal vein obstruction--which subset of patients could benefit the most? Authors' reply || Letter: Portal vein obstruction - Which subset of patients could benefit the most?	No	0.97 == Yes *
Title	Title: Some diseases associated with ulcero- hemorrhagic colitis: complication or coincidence. [French] Original Title: Quelques maladies associees a la colite ulcero- Hemorragique: Complications ou coincidences || Title: [Various diseases associated with ulcero-hemorrhagic colitis: complications or coincidences] Original Title: Quelques maladies associees a la colite ulcero-hemorragique: complications ou coincidences.	No	1.00 == Yes *
Title	Title: [HELLP in the second trimester in a patient with antiphospholipid syndrome] Original Title: HELLP kan ses i andet trimester ved antifosfolipidsyndrom. || Title: HELLP kan ses i andet trimester ved antifosfolipidsyndrom	No	1.00 == Yes *
Title	Title: NFkappaB inhibition decreases hepatocyte proliferation but does not alter apoptosis in obstructive jaundice Reversed Title: ecidnuaj evitcurtsbo ni sisotpopa retla ton seod tub noitarefilorp etycotapeh sesaerced noitibihni BappakFN || Title: NF kappa B inhibition decreases hepatocyte proliferation but does not alter apoptosis in obstructive jaundice Reversed title: ecidnuaj evitcurtsbo ni sisotpopa retla ton seod tub noitarefilorp etycotapeh sesaerced noitibihni B appak FN	No	1.00 == Yes *
Title	Title: Case report. Duplication of the portal vein: a rare congenital anomaly Reversed Title: ylamona latinegnoc erar a :niev latrop eht fo noitacilpuD .troper esaC || Title: Duplication of the portal vein - A rare congenital anomaly Reversed title: ylamona latinegnoc erar A - niev latrop eht fo noitacilpuD	No	0.96 == Yes *
Title	Title: La sémantique de l'image radiologique. Intérêt du procédé de soustraction électronique en couleurs d'Oosterkamp en angiographie abdominale Reversed Title: elanimodba eihpargoigna ne pmakretsoO'd srueluoc ne euqinortcelé noitcartsuos ed édécorp ud têrétnI .euqigoloidar egami'l ed euqitnamés aL || Title: INTERET DU PROCEDE DE SOUSTRACTION ELECTRONIQUE EN COULEURS D'OOSTERKAMP EN ANGIOGRAPHIE ABDOMINALE Reversed title: ELANIMODBA EIHPARGOIGNA NE PMAKRETSOO'D SRUELUOC NE EUQINORTCELE NOITCARTSUOS ED EDECORP UD TERETNI	No	0.91 == Yes *
Authors	Cobos Mateos, J. M.; Aguinaga Manzanos, M. V.; Casas Pinillos, M. S.; Gonzalez Conde, R.; Gonzalez Sanchez, J. A.; De Miguel Velasco, J. E.; Soleto Saez, E.; Suarez Mier, M. P. || Mateos, J. M. C.; Manzanos, M. V. A.; Pinillos, M. S. C.; Conde, R. G.; Sanchez, J. A. G.; Velasco, J. E. D.; Saez, E. S.; Mier, M. P. S.	No	0.75 == Yes
Authors	Danilă, M.; Sporea, I.; Popescu, A.; şirli, R. || Danila, M.; Sporea, I.; Popescu, A.; Sirli, R.	No	0.93 == Yes
Authors	Lv, Y.; Qi, X.; Xia, J.; Fan, D.; Han, G. || Lv, Y.; Qi, X. S.; Xia, J. L.; Fan, D. M.; Han, G. H.	No	0.90 == Yes
Authors	[empty] || Anonymous,	No	Yes
Journal	British journal of surgery || Br J Surg	No	Similar == Yes
Journal	European Journal of Gastroenterology and Hepatology || European Journal of Gastroenterology & Hepatology	No	Similar == Yes
Journal + ISSN	Japanese Journal of Cancer and Chemotherapy [ISSN: 2690-2692] || Gan To Kagaku Ryoho [ISSN: 2690-2692]	No	Similar == Yes
Journal	JAMA || Journal of the American Medical Association	No	Similar == Yes
Journal	The Lancet Haematology || Lancet Haematol	No	Similar == Yes
Journal	Hepatology || Hepatology International	No	Similar == Yes *
Journal	AJR Am J Roentgenol || American Journal of Roentgenology	No	Similar == Yes
Journal	British journal of surgery || Surgery	No	NOT similar == No

*: In these cases DedupEndNote's comparison of this field is not accurate. However, the comparison of the other fields for these records does not result in YES answers, so the records are ultimately not considered duplicates.
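The title scores in the table come from a Jaro-Winkler comparison of normalized titles, forward and reversed. The Python sketch below is a textbook Jaro-Winkler with a prefix scaling of 0.1, not the library DedupEndNote actually uses, so individual scores may differ slightly. Taking the higher of the forward and reversed score is an assumption about what "(one of) the normalized titles" means.

```python
import re


def jaro(s: str, t: str) -> float:
    """Plain Jaro similarity between two strings."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [c for c, hit in zip(s, s_hit) if hit]
    t_seq = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s: str, t: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    score = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * scaling * (1 - score)


def normalize_title(title: str) -> str:
    """Lowercase, drop "<...>" markup, replace non-alphanumerics with spaces."""
    title = re.sub(r"<[^>]*>", "", title.lower())
    title = re.sub(r"\W|_", " ", title)
    return " ".join(title.split())


def title_similarity(a: str, b: str) -> float:
    """Best of the forward and the reversed comparison (an assumption)."""
    na, nb = normalize_title(a), normalize_title(b)
    return max(jaro_winkler(na, nb), jaro_winkler(na[::-1], nb[::-1]))
```

For the "Duplication of the portal vein" pair in the table, the forward comparison is penalized by the leading "Case report.", while the reversed strings share a long common start, and this sketch gives roughly 0.96 for the reversed comparison.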

See here for comparison with other tools and published test sets.

See the project documentation for a description of this test database.

Tool	Setting	Duplicates found	Duplicates to delete	After deduplication	% kept
EndNote	Author + Year + Title + Reference Type (default setting)	32,891	19,959	32,869	62%
EndNote	Author + Year + Title	32,920	19,976	32,852	62%
EndNote	Author + Year + Title + Secondary Title (Journal)	22,120	12,333	40,495	77%
DedupEndNote		38,407	24,201	28,627	54%
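The "% kept" column can be checked from the other columns. The total of 52,828 records used below is not stated in the text; it is derived by adding "Duplicates to delete" and "After deduplication", which gives the same sum for every row.

```python
# Derived, not stated in the text: 19,959 + 32,869 = 52,828 records in the test database.
TOTAL_RECORDS = 19_959 + 32_869


def percent_kept(duplicates_to_delete: int, total: int = TOTAL_RECORDS) -> int:
    """Share of records remaining after deduplication, rounded to a whole percent."""
    return round(100 * (total - duplicates_to_delete) / total)
```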

Validation

A general subset of 4923 records was manually checked (150 records without an Author, 79 with Author "Anonymous", the 366 articles with "phase" in the title (because of "Phase I", "Phase I/II", ...), the rest: "Aagaard" ... "Abbitt", "Anoob" ... "Axelrod", "Liu", "v Koppenfels" ... "von Woellwarth", and some others). Because replies are treated in a special way, 130 records with "reply" in the title which were identified as duplicates were also checked, but only for true and false positives.

Set	True positive	False negative	Sensitivity	True negative	False positive	Specificity
	correctly identified duplicates	missed duplicates		correctly identified unique records	incorrectly identified as duplicates
General	3681	275	93.0%	966	1	99.9%
Replies	130				0
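The sensitivity and specificity of the General row follow from the four counts using the standard definitions. A short Python check:

```python
def sensitivity(true_positive: int, false_negative: int) -> float:
    """Share of real duplicates that were identified as duplicates."""
    return true_positive / (true_positive + false_negative)


def specificity(true_negative: int, false_positive: int) -> float:
    """Share of unique records that were correctly left alone."""
    return true_negative / (true_negative + false_positive)


# General row: 3681 / (3681 + 275) and 966 / (966 + 1)
general_sensitivity = round(100 * sensitivity(3681, 275), 1)
general_specificity = round(100 * specificity(966, 1), 1)
```

The four counts add up to the 4923 manually checked records.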

The False Positives are all conference abstracts:

Cool, J.; Rosenblatt, R.; Kumar, S.; Lucero, C.; Fortune, B.; Crawford, C. V.; Jesudian, A.: THE ASSOCIATION BETWEEN PORTAL VEIN THROMBOSIS AND OTHER VENOUS THROMBOEMBOLISM IN CIRRHOSIS: ANALYSIS OF A NATIONALLY REPRESENTATIVE INPATIENT COHORT
In: Gastroenterology 154 (6), 2018, p. S1178-S1179
Cool, J.; Rosenblatt, R.; Kumar, S.; Lucero, C.; Fortune, B.; Crawford, C. V.; Jesudian, A.: TRENDS IN THE PREVALENCE OF PORTAL VEIN THROMBOSIS AND ASSOCIATED MORTALITY IN CIRRHOSIS: ANALYSIS OF A NATIONALLY REPRESENTATIVE INPATIENT COHORT
In: Gastroenterology 154 (6 Supplement 1), 2018, p. S-1178 DOI: 10.1016/s0016-5085(18)33901-5

Limitations

Input file size: The maximum size of the input file is limited to 150MB.
Input file format: The program only handles files in RIS format, not in XML or CSV format.
Input file encoding: The program assumes that the input file is encoded as UTF-8.
If authors AND (all) titles AND (all) journal names for a record use a non-Latin script, results for this record may be inaccurate.
(when deduplicating one file:) The input file must be an export from ONE EndNote database: the ID fields are used internally for identifying the records, so they have to be unique. However, if the RIS file does not have an ID field in the first publication, DedupEndNote assumes the whole file has no ID fields and gives every publication an ID (starting with 1). This has only been tested on Zotero files!
The program has been developed and tested for biomedical databases (PubMed, EMBASE, ...) and some general databases (Web of Science, Scopus). The data sets used were the results of biomedical queries. Deduplicating records from other databases is not guaranteed to work, and performance is often very poor, esp. for non journal articles (see Justification. 2.5).
The program uses a bibliographic point of view:
- an article or conference abstract that has been published in more than one (issue of a) journal is not considered a duplicate publication.
- Records for each publication year are compared to records from the same and the following year: a record from 2016 is compared to the records from 2015 (when treating the records from 2015) and from 2016 and 2017 (when treating the records from 2016). A PubMed ahead-of-print record from 2013 and a corresponding record from 2017 (when it was 'officially' published) will not be compared, and so cannot be identified as duplicates.
- Bibliographic databases are not always very accurate in the starting page of a publication. Because starting page is part of the comparisons, DedupEndNote misses the duplicates when bibliographic databases don't agree on the starting page (and one or both records have no DOIs).
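The year window described above can be sketched as a way of generating candidate pairs: each year's records are paired with each other and with the records of the following year. The Python below is an illustration with a hypothetical input (a mapping from record ID to publication year); records without a year, which the tool compares to all records, are left out.

```python
from collections import defaultdict
from itertools import combinations, product


def candidate_pairs(years: dict) -> set:
    """Pairs of record IDs whose publication years are equal or consecutive."""
    by_year = defaultdict(list)
    for record_id, year in years.items():
        by_year[year].append(record_id)
    pairs = set()
    for year, ids in by_year.items():
        pairs.update(combinations(sorted(ids), 2))              # same year
        pairs.update(product(ids, by_year.get(year + 1, [])))   # following year
    return pairs
```

With this windowing a 2013 ahead-of-print record and its 2017 counterpart never form a candidate pair, which is the limitation noted in the text.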

If you have any questions about the tool or come across a problem when trying to use it, please raise an issue on the GitHub Repository.

How to cite

If you use this software, please cite it as:
Lobbestael, G. (2023). DedupEndNote (Version 1.0.0) [Computer software]. https://github.com/globbestael/DedupEndNote

Changelog

2024-01-14: 1.0.1: BUG: Doesn't handle author last name starting with a hyphen
2023-09-02: ISBN-10 and ISBN-13 are compared correctly: 9024274214 = 9789024274215
2023-09-02: BUG: Publication years of form "1-1-1989" (instead of "1989") are parsed as years
2023-08-21: If a title with ": " has at least 50 characters before the ": ", that first part is also used as a title variant
2023-08-21: BUG: Result file for deduplication of two files in Mark mode was not always written
2023-08-18: "How to cite" added (CITATION.cff)
2023-08-12: Files without IDs (record numbers) for each record (e.g. Zotero), get an ID assigned by DedupEndNote
2023-08-12: Deduplication of Conference proceedings from Web of Science was very poor (too many false positives). This problem has been solved