TypeError 'NoneType' object is not callable

Problem description

  1. Loading the dataset Matthijs/cmu-arctic-xvectors from Hugging Face with the load_dataset function from datasets fails with TypeError: 'NoneType' object is not callable.

  2. Code:

from datasets import load_dataset

embedding_dataset = load_dataset(
    'Matthijs/cmu-arctic-xvectors',
    split='validation',
)

embedding_dataset
  3. Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[6], line 6
2 import logging
4 logging.basicConfig(level=logging.DEBUG)
----> 6 embedding_dataset = load_dataset('Matthijs/cmu-arctic-xvectors', split='validation')
8 embedding_dataset

File /opt/conda/envs/audio/lib/python3.10/site-packages/datasets/load.py:2129, in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, keep_in_memory, save_infos, revision, token, streaming, num_proc, storage_options, trust_remote_code, **config_kwargs)
2124 verification_mode = VerificationMode(
2125 (verification_mode or VerificationMode.BASIC_CHECKS) if not save_infos else VerificationMode.ALL_CHECKS
2126 )
2128 # Create a dataset builder
-> 2129 builder_instance = load_dataset_builder(
2130 path=path,
2131 name=name,
2132 data_dir=data_dir,
2133 data_files=data_files,
2134 cache_dir=cache_dir,
2135 features=features,
2136 download_config=download_config,
2137 download_mode=download_mode,
2138 revision=revision,
2139 token=token,
2140 storage_options=storage_options,
2141 trust_remote_code=trust_remote_code,
2142 _require_default_config_name=name is None,
2143 **config_kwargs,
2144 )
2146 # Return iterable dataset in case of streaming
2147 if streaming:

File /opt/conda/envs/audio/lib/python3.10/site-packages/datasets/load.py:1886, in load_dataset_builder(path, name, data_dir, data_files, cache_dir, features, download_config, download_mode, revision, token, storage_options, trust_remote_code, _require_default_config_name, **config_kwargs)
1884 builder_cls = get_dataset_builder_class(dataset_module, dataset_name=dataset_name)
1885 # Instantiate the dataset builder
-> 1886 builder_instance: DatasetBuilder = builder_cls(
1887 cache_dir=cache_dir,
1888 dataset_name=dataset_name,
1889 config_name=config_name,
1890 data_dir=data_dir,
1891 data_files=data_files,
1892 hash=dataset_module.hash,
1893 info=info,
1894 features=features,
1895 token=token,
1896 storage_options=storage_options,
1897 **builder_kwargs,
1898 **config_kwargs,
1899 )
1900 builder_instance._use_legacy_cache_dir_if_possible(dataset_module)
1902 return builder_instance

TypeError: 'NoneType' object is not callable
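
Since the outcome turns out to depend on the installed library version, it can help to record the environment alongside a traceback like the one above. The following is a minimal sketch, not part of the original post: the helper names `installed_versions` and `environment_report` are hypothetical, and the default package list is an assumption taken from a few of the dependencies that `pip show datasets` reports.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_report(packages=("datasets", "huggingface-hub", "fsspec", "pyarrow")):
    """Build a short text report of the Python version and selected package versions."""
    lines = [f"python: {platform.python_version()}"]
    for name, version in installed_versions(packages).items():
        lines.append(f"{name}: {version or 'not installed'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste this output next to the traceback when asking for help.
    print(environment_report())
```

The report only reads package metadata, so it works even when importing datasets itself fails.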

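Solution

  1. The datasets version originally installed was 3.3.0. I downgraded step by step, trying 2.21.0 and then 2.16.0, neither of which fixed the problem; 2.10.0 finally worked. (If you are using a Notebook, remember to restart the kernel after downgrading.)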
# Check the installed datasets version
pip show datasets

# Downgrade to 2.10.0
pip install datasets==2.10.0
  2. Version after the downgrade:
Name: datasets
Version: 2.10.0
Summary: HuggingFace community-driven open-source library of datasets
Home-page: https://github.com/huggingface/datasets
Author: HuggingFace Inc.
Author-email: thomas@huggingface.co
License: Apache 2.0
Location: /opt/conda/envs/audio/lib/python3.10/site-packages
Requires: aiohttp, dill, fsspec, huggingface-hub, multiprocess, numpy, packaging, pandas, pyarrow, pyyaml, requests, responses, tqdm, xxhash
Required-by: evaluate
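
The restart reminder matters because a running Python process keeps the module it has already imported, even after pip replaces the files on disk. The sketch below is one way to check this from inside a notebook, under stated assumptions: `TESTED_VERSIONS` only encodes the four results reported in this post, and the helper names `version_status`, `imported_version`, and `needs_restart` are hypothetical, not part of datasets.

```python
import sys
from importlib import metadata

# Results reported in this post for loading Matthijs/cmu-arctic-xvectors.
# True = loaded successfully, False = raised the TypeError.
# Any version not listed here was not tested in the post.
TESTED_VERSIONS = {
    "3.3.0": False,
    "2.21.0": False,
    "2.16.0": False,
    "2.10.0": True,
}


def version_status(version):
    """Map a datasets version string to 'works', 'fails', or 'untested'."""
    result = TESTED_VERSIONS.get(version)
    if result is None:
        return "untested"
    return "works" if result else "fails"


def imported_version(module_name="datasets"):
    """Version of the module already imported in this process, or None."""
    module = sys.modules.get(module_name)
    if module is None:
        return None
    return getattr(module, "__version__", None)


def needs_restart(module_name="datasets", dist_name="datasets", lookup=metadata.version):
    """True if the imported module's version differs from the one installed on disk."""
    loaded = imported_version(module_name)
    if loaded is None:
        # Nothing imported yet, so a fresh import picks up the installed version.
        return False
    return loaded != lookup(dist_name)
```

In a notebook, calling `needs_restart()` right after `pip install datasets==2.10.0` should return `True` if datasets 3.3.0 was already imported in that kernel, and `version_status(metadata.version("datasets"))` reports how the installed version fared in the post's tests.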

https://liberow.github.io/2025/02/20/error/NoneType/
Author: liberow
Posted on: February 20, 2025