This book constitutes the refereed proceedings of the Third International Conference on Statistical Language and Speech Processing, SLSP 2015, held in Budapest, Hungary, in November 2015.

The 26 full papers presented, together with invited talks, were carefully reviewed and selected from 71 submissions. The papers cover topics such as: anaphora and coreference resolution; authorship identification, plagiarism and spam filtering; computer-aided translation; corpora and language resources; data mining and semantic web; information extraction; information retrieval; knowledge representation and ontologies; lexicons and dictionaries; machine translation; multimodal technologies; natural language understanding; neural representation of speech and language; opinion mining and sentiment analysis; parsing; part-of-speech tagging; question-answering systems; semantic role labelling; speaker identification and verification; speech and language generation; speech recognition; speech synthesis; speech transcription; spelling correction; spoken dialogue systems; term extraction; text categorisation; text summarisation; and user modeling.

Lemmas and POS

The second series of experiments uses both lemmas and the POS tag included in Ancora as features. The rationale behind using a precalculated POS tag is that it might help focus the tagger on the categories we want: all the categories except “v” should be tagged exactly as the POS tag feature indicates; the only categories with new information are the subclasses of “v”. This could help the tagger avoid confusing the “v” tokens with other categories. In the previous experiments we saw that training taggers for such a large number of different tags can be quite slow (the longest experiment took around 36 h), so we decided to break the training corpus into smaller corpora.
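The setup described above can be illustrated with a minimal sketch. It assumes each token is available as a (word, lemma, POS) triple and that features are taken from a one-token window around the current position. The window size, the feature names, the `<pad>` symbol, and the round-robin splitting strategy are all assumptions made for illustration; the excerpt does not specify how the features were laid out or how the corpus was divided.

```python
from typing import Dict, List, Tuple

# Hypothetical token layout: (word, lemma, pos). Not taken from the paper.
Token = Tuple[str, str, str]


def token_features(sentence: List[Token], i: int) -> Dict[str, str]:
    """Return lemma and POS features for token i and its immediate neighbours."""
    feats: Dict[str, str] = {}
    for offset in (-1, 0, 1):  # assumed window of one token on each side
        j = i + offset
        if 0 <= j < len(sentence):
            _, lemma, pos = sentence[j]
        else:
            lemma, pos = "<pad>", "<pad>"  # positions outside the sentence
        feats[f"lemma[{offset}]"] = lemma
        feats[f"pos[{offset}]"] = pos
    return feats


def split_corpus(sentences: List[List[Token]], n_parts: int) -> List[List[List[Token]]]:
    """Split a corpus into n_parts smaller corpora, assigning sentences round-robin."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    return [sentences[k::n_parts] for k in range(n_parts)]
```

Because the POS feature of the current token already carries the non-verb category, a tagger trained on features like these only has to learn new distinctions for tokens whose POS is “v”.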

The second argument is arg1 (theme), which is also a noun phrase. The third argument is arg2 (beneficiary), and it is a prepositional phrase with head “a”. This is a possible subclass for the verb “ofrecer” (“to offer”) as seen in the corpus. To sum up, the information included in the verb class tag is:

• Argument structure values (arg0, arg1, arg2, arg3, arg4, argM, argL or our argC).
• Syntactic category for each argument (sn, S, sa, sadv, sp or v). In the case of a prepositional phrase, we include the preposition: sp_a, sp_de, sp_con, sp_en, or just sp for the rest.
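As a rough illustration of how such a tag could be assembled, the sketch below combines the argument structure value and the syntactic category of each argument into a single string. The role names, category names, and the four prepositions kept explicit come from the text; the `v|role:category` string layout and the function names are hypothetical, since the excerpt does not show the actual tag format. The example also assumes that the first argument of “ofrecer” is arg0.

```python
from typing import List, Optional, Tuple

ROLES = {"arg0", "arg1", "arg2", "arg3", "arg4", "argM", "argL", "argC"}
CATEGORIES = {"sn", "S", "sa", "sadv", "sp", "v"}
KEPT_PREPOSITIONS = {"a", "de", "con", "en"}  # all others collapse to plain "sp"

# (role, syntactic category, preposition or None)
Argument = Tuple[str, str, Optional[str]]


def category_label(category: str, preposition: Optional[str] = None) -> str:
    """Return the category, adding the preposition for the four kept cases."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown syntactic category: {category}")
    if category == "sp" and preposition in KEPT_PREPOSITIONS:
        return f"sp_{preposition}"
    return category


def verb_class_tag(arguments: List[Argument]) -> str:
    """Build a verb class tag; the 'v|role:category' layout is an assumed format."""
    parts = []
    for role, category, preposition in arguments:
        if role not in ROLES:
            raise ValueError(f"unknown argument role: {role}")
        parts.append(f"{role}:{category_label(category, preposition)}")
    return "|".join(["v"] + parts)


# Subclass described above for "ofrecer" (arg0 as the first role is an assumption).
ofrecer = [("arg0", "sn", None), ("arg1", "sn", None), ("arg2", "sp", "a")]
print(verb_class_tag(ofrecer))  # v|arg0:sn|arg1:sn|arg2:sp_a
```

Encoding the whole frame in one tag is what makes the tagset large: every distinct combination of roles and categories seen in the corpus becomes its own class.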

64 % for a supertagger over the same base corpus, but with a much larger tagset in a different grammar framework. However, the corpus we used was only a subset of the original corpus, because we had to prune many sentences that did not provide appropriate information about the verbs' arguments. As the performance of the supertagger has not plateaued, we consider that the results could be further improved by adding more training data. One way of doing this would be to annotate the arguments that are missing in the corpus, or to leverage the information contained in other Spanish corpora annotated in different formalisms.

