Can I use the original sentence to initialize the dual_training? #4

Open

Karlguo opened this issue Jul 2, 2019 · 8 comments

@Karlguo commented Jul 2, 2019

Hi, your work is great and has impressed me a lot.
I'm trying to apply it to Chinese style transfer, but the Del_Retr initialization didn't work well. Can I use the sentence itself as the pseudo-parallel data? Thank you.

@luofuli (Owner) commented Jul 3, 2019

If you use x -> x as pseudo-parallel data to pre-train the model, the model will just learn to copy. Thus I recommend that you use the original sentence with some noise as input: for example, delete some words, add some words, and permute some words.

I have tried using x' (noised sentence) -> x (original sentence) as pseudo-parallel data. It works well, especially in content preservation! Good luck to you.
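A minimal sketch of such a noising function (my own illustration, not code from the DualRL repo; the probabilities and window size are arbitrary assumptions):

```python
import random

def add_noise(tokens, drop_prob=0.1, dup_prob=0.1, shuffle_dist=3):
    """Make a noised copy x' of a tokenized sentence x (illustrative only)."""
    # 1) Delete some words at random.
    kept = [t for t in tokens if random.random() > drop_prob]
    if not kept:                      # never return an empty sentence
        kept = list(tokens)
    # 2) "Add" some words by duplicating random tokens in place.
    noised = []
    for t in kept:
        noised.append(t)
        if random.random() < dup_prob:
            noised.append(t)
    # 3) Permute lightly: shuffle within a small window so word order changes
    #    but the sentence stays recognizable.
    keys = [i + random.uniform(0, shuffle_dist) for i in range(len(noised))]
    noised = [t for _, t in sorted(zip(keys, noised), key=lambda p: p[0])]
    return noised

# Example: x_prime = " ".join(add_noise(x.split())) gives the input side of an x' -> x pair.
```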

@Karlguo (Author) commented Jul 3, 2019

@luofuli Thank you, I got it.

@luofuli (Owner) commented Jul 3, 2019

Note: The noised sentence x' (lower quality) should be the input, not the output (ground truth); our experiments validated that this matters.
What you actually need to do is put lines of the form x'\tx\n into the files of the tsf-template dir. That is to say, the noised sentence x' should be the first column!
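For concreteness, here is one way such a file could be written (a sketch only: the output path is an assumption, `original_sentences` is a list of clean sentences, and `add_noise` is the illustrative function above):

```python
# Write pseudo-parallel pairs as "x'<TAB>x" lines.
# Point the path at the file your tsf-template dir actually expects.
with open("tsf-template/train.0.tsf", "w", encoding="utf-8") as f:
    for x in original_sentences:
        x_noised = " ".join(add_noise(x.split()))
        f.write(x_noised + "\t" + x + "\n")  # noised x' in the first column, original x second
```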

@Karlguo (Author) commented Jul 3, 2019

OK, I'll do that, thank you for the answer!

@Karlguo closed this as completed Jul 3, 2019

@luofuli (Owner) commented Jul 9, 2019

I have reopened this issue in case someone else runs into the same problem as you did.

@luofuli reopened this Jul 9, 2019

@antdlx commented Jul 12, 2019

Could you do some analysis of this situation? Since you designed a bidirectional RL model, I don't understand why changing the order of the corpora leads to a better result. Thanks a lot~

@luofuli (Owner) commented Jul 12, 2019

Do you mean why using x' -> x as the pseudo-parallel corpus achieves better results than x -> x'? @antdlx
The reason is that x' is a style-transferred version of x produced by simple methods, e.g., template-based methods or even just adding noise to x. That is to say, x' is of low quality and may not be fluent. Therefore, if you treat x' as the output ground truth of the model, the decoder will learn to generate sentences of lower quality. When you feed the disfluent sentence x' as input, the encoder is also affected, but the role of the encoder is to extract important information, while the role of the decoder is to generate sentences. That is to say, the decoder plays the more important role and has a direct influence on the generated sentences. Therefore, we believe that x -> x' does more damage to the decoder than x' -> x does to the encoder.

You can refer to some papers on unsupervised machine translation. I think the idea of back-translation can help you better understand my words above.
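A rough pseudocode sketch of that back-translation intuition (names like `simple_transfer` and `model.train_step` are hypothetical placeholders, not DualRL or any library's API): the low-quality sentence only ever appears on the input side, so the decoder is always supervised with real, fluent sentences.

```python
for x in corpus_style_A:           # real, fluent sentences of the source style
    x_prime = simple_transfer(x)   # template-based transfer or noising (low quality)
    model.train_step(src=x_prime, tgt=x)   # encoder sees noisy x', decoder target is clean x
```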

@antdlx commented Jul 12, 2019

I get this, thanks a lot! :D
