---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
Cell In[18], line 34
     31 topic_evolution[period] = lda_model
     33 # Visualize the topics for the month
---> 34 vis = pyLDAvis.gensim_models.prepare(lda_model, corpus_month, dictionary)
     35 pyLDAvis.save_html(vis, f"result_{period}.html")
     37 # Print topic distribution for each month

File e:\abgrad\data\.venv\Lib\site-packages\pyLDAvis\gensim_models.py:123, in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs)
     78 """Transforms the Gensim TopicModel and related corpus and dictionary into
     79 the data structures needed for the visualization.
     80 (...)
    120 See pyLDAvis.prepare for **kwargs.
    121 """
    122 opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs)
--> 123 return pyLDAvis.prepare(**opts)

File e:\abgrad\data\.venv\Lib\site-packages\pyLDAvis\_prepare.py:432, in prepare(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency, R, lambda_step, mds, n_jobs, plot_opts, sort_topics, start_index)
    426 # Quick fix for red bar width bug. We calculate the
    427 # term frequencies internally, using the topic term distributions and the
    428 # topic frequencies, rather than using the user-supplied term frequencies.
    429 # For a detailed discussion, see: https://github.com/cpsievert/LDAvis/pull/41
    430 term_frequency = np.sum(term_topic_freq, axis=0)
...
--> 196 msg = f"{cmd}:{name}:{rtype}\n".encode("ascii")
    197 nbytes = os.write(self._fd, msg)
    198 assert nbytes == len(msg)

UnicodeEncodeError: 'ascii' codec can't encode characters in position 18-20: ordinal not in range(128)
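Error: UnicodeEncodeError: 'ascii' codec can't encode characters in position 18-20: ordinal not in range(128)
Solution:
The problem is n_jobs. By default the work runs in parallel, and that code path fails when a system file name contains Chinese characters (the last frame of the traceback encodes a name with the ASCII codec). Adding n_jobs=1 is enough, as follows: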
vis = pyLDAvis.gensim_models.prepare(lda_model, corpus_month, dictionary, n_jobs=1)
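
If you would rather keep parallel processing on machines where it works, one option is to fall back to n_jobs=1 only when a relevant path contains non-ASCII characters. The sketch below is an illustration, not part of pyLDAvis: is_ascii_safe and choose_n_jobs are hypothetical helper names, and both the list of paths to check (temp directory, working directory) and the parallel default of -1 are assumptions that may not match your environment.

```python
import os
import tempfile


def is_ascii_safe(text: str) -> bool:
    """Return True if text can be encoded with the ASCII codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def choose_n_jobs(paths, parallel_default=-1):
    """Hypothetical helper: return 1 (no parallelism) if any path has
    non-ASCII characters, otherwise the assumed parallel default."""
    if all(is_ascii_safe(p) for p in paths):
        return parallel_default
    return 1


# Assumption: these are the locations whose names may end up ASCII-encoded.
candidate_paths = [tempfile.gettempdir(), os.getcwd()]
n_jobs = choose_n_jobs(candidate_paths)

# Usage with the objects from the cell above (not executed here):
# vis = pyLDAvis.gensim_models.prepare(
#     lda_model, corpus_month, dictionary, n_jobs=n_jobs
# )
```

If it is unclear which path triggers the error, hard-coding n_jobs=1 as shown above remains the simplest fix; it only makes the prepare step slower.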