How to use JohnSnowLabs NLP Spell correction module NorvigSweetingModel?
I was going through the JohnSnowLabs SpellChecker documentation and found the implementation of Norvig's algorithm there. The example section has just the following two lines:
import com.johnsnowlabs.nlp.annotator.NorvigSweetingModel
NorvigSweetingModel.pretrained()
Can anyone please help me apply this pretrained model to my DataFrame (df) below, to spell-correct the "names" column?
+----------------+---+------------+
|           names|age|       color|
+----------------+---+------------+
|      [abc, cde]| 19|    red, abc|
|[eefg, efa, efb]|192|efg, efz efz|
+----------------+---+------------+
I have tried to do it as follows:
val schk = NorvigSweetingModel.pretrained().setInputCols("names").setOutputCol("Corrected")
val cdf = schk.transform(df)
But the above code gave me the following error:
java.lang.IllegalArgumentException: requirement failed: Wrong or missing inputCols annotators in SPELL_a1f11bacb851. Received inputCols: names. Make sure such columns have following annotator types: token
at scala.Predef$.require(Predef.scala:224)
at com.johnsnowlabs.nlp.AnnotatorModel.transform(AnnotatorModel.scala:51)
... 49 elided
Thanks.
scala apache-spark nlp apache-spark-ml johnsnowlabs-spark-nlp
edited Nov 28 '18 at 5:13 by Community♦
asked Nov 21 '18 at 18:15 by user3243499
1 Answer
spark-nlp annotators are designed to be used in their own specific pipelines, and input columns for different transformers have to include special metadata.
The exception already tells you that the input to the NorvigSweetingModel should be tokenized:
Make sure such columns have following annotator types: token
If I am not mistaken, at minimum you'll have to assemble documents and tokenize them here.
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.NorvigSweetingModel
import com.johnsnowlabs.nlp.annotators.Tokenizer
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.functions.concat_ws // used below to join the array into one string
// toDF and $"..." rely on spark.implicits._, which spark-shell imports automatically
val df = Seq(Seq("abc", "cde"), Seq("eefg", "efa", "efb")).toDF("names")
// Each stage produces the annotator type that the next stage requires
val nlpPipeline = new Pipeline().setStages(Array(
  new DocumentAssembler().setInputCol("names").setOutputCol("document"),
  new Tokenizer().setInputCols("document").setOutputCol("tokens"),
  NorvigSweetingModel.pretrained().setInputCols("tokens").setOutputCol("corrected")
))
A Pipeline like this can be applied to your data with a small adjustment: the input data has to be string, not array<string>*:
val result = df
  .transform(_.withColumn("names", concat_ws(" ", $"names")))
  .transform(df => nlpPipeline.fit(df).transform(df))
result.show()
+------------+--------------------+--------------------+--------------------+
|       names|            document|              tokens|           corrected|
+------------+--------------------+--------------------+--------------------+
|     abc cde|[[document, 0, 6,...|[[token, 0, 2, ab...|[[token, 0, 2, ab...|
|eefg efa efb|[[document, 0, 11...|[[token, 0, 3, ee...|[[token, 0, 3, ee...|
+------------+--------------------+--------------------+--------------------+
If you want an output that can be exported, you should extend your Pipeline with Finisher.
import com.johnsnowlabs.nlp.Finisher
new Finisher().setInputCols("corrected").transform(result).show
+------------+------------------+
|       names|finished_corrected|
+------------+------------------+
|     abc cde|        [abc, cde]|
|eefg efa efb|  [eefg, efa, efb]|
+------------+------------------+
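The Finisher can also be added as the last stage of the Pipeline itself rather than applied afterwards. The following is only a sketch under assumptions: it is written as a standalone script, so the SparkSession setup and the names `input`, `fullPipeline` and `finished` are illustrative and not part of the original code; in spark-shell the session and implicits already exist.

```scala
import com.johnsnowlabs.nlp.{DocumentAssembler, Finisher}
import com.johnsnowlabs.nlp.annotator.NorvigSweetingModel
import com.johnsnowlabs.nlp.annotators.Tokenizer
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.concat_ws

// Assumption: local session for illustration; getOrCreate reuses an existing one
val spark = SparkSession.builder()
  .appName("norvig-finisher-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq(Seq("abc", "cde"), Seq("eefg", "efa", "efb")).toDF("names")

// DocumentAssembler needs a string column, so join the array elements first
val input = df.withColumn("names", concat_ws(" ", $"names"))

// Same stages as above, with Finisher appended as the final stage
val fullPipeline = new Pipeline().setStages(Array(
  new DocumentAssembler().setInputCol("names").setOutputCol("document"),
  new Tokenizer().setInputCols("document").setOutputCol("tokens"),
  NorvigSweetingModel.pretrained().setInputCols("tokens").setOutputCol("corrected"),
  new Finisher().setInputCols("corrected")
))

val finished = fullPipeline.fit(input).transform(input)
finished.show(false)
```

Keeping the Finisher inside the Pipeline means a single fit and transform produces the plain `finished_corrected` column, which is convenient if the fitted model is saved and reused.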
* According to the docs, DocumentAssembler can read either a String column or an Array[String], but it doesn't look like it works in practice in 1.7.3:
df.transform(df => nlpPipeline.fit(df).transform(df)).show()
org.apache.spark.sql.AnalysisException: cannot resolve 'UDF(names)' due to data type mismatch: argument 1 requires string type, however, '`names`' is of array<string> type.;;
'Project [names#62, UDF(names#62) AS document#343]
+- AnalysisBarrier
+- Project [value#60 AS names#62]
+- LocalRelation [value#60]
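If only the corrected words are needed and the annotation columns should stay in place, the annotation structs can be queried directly. This is a hedged sketch: it assumes that the fourth field of each annotation (the one holding the word in the output above) is named `result`, which `printSchema()` will confirm or refute for your version. The names `text`, `annotated` and `words` and the session setup are illustrative only.

```scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.NorvigSweetingModel
import com.johnsnowlabs.nlp.annotators.Tokenizer
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: local session for illustration; getOrCreate reuses an existing one
val spark = SparkSession.builder()
  .appName("annotation-fields-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Input is already a plain string column, as DocumentAssembler requires
val text = Seq("abc cde", "eefg efa efb").toDF("names")

val annotated = new Pipeline().setStages(Array(
  new DocumentAssembler().setInputCol("names").setOutputCol("document"),
  new Tokenizer().setInputCols("document").setOutputCol("tokens"),
  NorvigSweetingModel.pretrained().setInputCols("tokens").setOutputCol("corrected")
)).fit(text).transform(text)

// Inspect the struct layout of the annotation columns before relying on field names
annotated.printSchema()

// Selecting a field of an array<struct> column yields an array of that field
// (assumes the field is called "result")
val words = annotated.select(col("names"), col("corrected.result").as("corrected_words"))
words.show(false)
```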
edited Nov 21 '18 at 23:26
answered Nov 21 '18 at 19:04 by user10465355
How do I get the spell-corrected values? The values under "corrected" come as [[token, 0, 3, eefg, [sentence -> 1]], [token, 5, 7, efa, [sentence -> 1]], [token, 9, 11, efb, [sentence -> 1]]]
– user3243499
Nov 21 '18 at 19:21
Does each of these list items, like [sentence -> 1], have any standard structure/meaning?
– user3243499
Nov 21 '18 at 19:23
@user3243499 "How do I get the spell-corrected values?" - Please check the Finisher part.
– user10465355
Nov 21 '18 at 20:18
"Does each of these list items, like [sentence -> 1], have any standard structure/meaning?" - It is metadata. It is map<string,string>, so the structure is not fixed, but in this case it contains information about the sentence from the document.
– user10465355
Nov 21 '18 at 20:21