Connecting to HDFS with Java: Usage Examples
I. Introduction
The Hadoop Distributed File System (HDFS) is the storage foundation of the big-data ecosystem. For Java developers, being able to operate on HDFS from Java code is a key skill for big-data work. This article walks through a few simple examples that show how to connect to HDFS from Java and perform basic file operations.
II. Connecting to HDFS
1. Step 1: Add the dependency
In a Maven project, add the Hadoop client dependency:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.8.2</version>
</dependency>
2. Step 2: Create the connection
The basic code for creating an HDFS connection is as follows:
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
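For reference, below is a minimal, self-contained sketch of this connection step, including the imports the two lines above rely on. The class name HdfsConnectExample is only illustrative, and the NameNode address hdfs://localhost:9000 should be adjusted to your cluster. Note that new URI(...) can throw URISyntaxException, which is why the example simply declares throws Exception.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Minimal sketch: open a connection to HDFS and print the working directory.
// The NameNode URI (hdfs://localhost:9000) is an assumption; adjust it to your cluster.
public class HdfsConnectExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileSystem implements Closeable, so try-with-resources closes it for us
        try (FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)) {
            System.out.println("Connected, working directory: " + fs.getWorkingDirectory());
        }
    }
}

If you run into permission errors, the FileSystem.get(uri, conf, user) overload lets you specify which HDFS user to connect as.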
III. Working with HDFS
1. Check whether a file exists
public static boolean test(Configuration conf, String path) {
    // Uses the fs.defaultFS configured in conf; try-with-resources closes the FileSystem
    try (FileSystem fs = FileSystem.get(conf)) {
        return fs.exists(new Path(path));
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    }
}
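A short usage sketch for the helper above, assuming it lives in the same class. The path /a/a.txt is just an illustrative value (it reuses the path from the delete example further below):

Configuration conf = new Configuration();
// The helper relies on fs.defaultFS, so set it explicitly here
conf.set("fs.defaultFS", "hdfs://localhost:9000");
boolean exists = test(conf, "/a/a.txt");
System.out.println(exists ? "Path exists." : "Path does not exist.");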
2. Insert content at the beginning or end of a file
public static void insertContent(String filePath, String content, boolean insertAtBeginning) throws URISyntaxException {
    try {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
        Path path = new Path(filePath);
        if (fs.exists(path)) {
            // Read the existing file content line by line
            BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(path)));
            StringBuilder originalContent = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                originalContent.append(line).append("\n");
            }
            reader.close();
            // Overwrite the file, writing the new content before or after the original
            FSDataOutputStream outputStream = fs.create(path, true);
            if (insertAtBeginning) {
                outputStream.write(content.getBytes());
                outputStream.write(originalContent.toString().getBytes());
            } else {
                outputStream.write(originalContent.toString().getBytes());
                outputStream.write(content.getBytes());
            }
            outputStream.close();
        } else {
            // File does not exist yet: create it with the new content
            FSDataOutputStream outputStream = fs.create(path);
            outputStream.write(content.getBytes());
            outputStream.close();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
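If you only need to add content to the end of a file, HDFS also offers FileSystem.append(Path), which avoids reading and rewriting the whole file. Below is a minimal sketch, assuming the same hdfs://localhost:9000 NameNode and that append is supported on the cluster; the helper name appendContent is only illustrative.

public static void appendContent(String filePath, String content) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)) {
        // append() requires the file to already exist and append support on the cluster
        try (FSDataOutputStream out = fs.append(new Path(filePath))) {
            out.write(content.getBytes());
        }
    }
}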
3. Delete a file
public static void main(String[] args) throws IOException, URISyntaxException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
    String name = "hdfs://localhost:9000/a/a.txt";
    // The second argument controls recursive deletion; false is enough for a single file
    boolean res = fs.delete(new Path(name), false);
    if (res) {
        System.out.println("File deleted successfully.");
    } else {
        System.out.println("File deletion failed.");
    }
}
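To delete a directory and its contents, the second argument of delete() must be true. Continuing with the fs created above (the directory path /a is illustrative):

// Recursively delete the directory /a and everything under it
boolean removed = fs.delete(new Path("hdfs://localhost:9000/a"), true);
System.out.println(removed ? "Directory deleted." : "Directory deletion failed.");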
4. Move a file
public static void moveFile(String sourcePath, String destinationPath) {
    try {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
        Path src = new Path(sourcePath);
        Path dst = new Path(destinationPath);
        if (!fs.exists(src)) {
            System.out.println("Source file does not exist.");
            return;
        }
        // Make sure the destination directory exists before renaming
        if (!fs.exists(dst.getParent())) {
            fs.mkdirs(dst.getParent());
        }
        // HDFS has no separate "move" API; rename() moves the file
        boolean success = fs.rename(src, dst);
        if (success) {
            System.out.println("File moved successfully.");
        } else {
            System.out.println("File move failed.");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
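A quick usage sketch; both paths are illustrative values:

// Move a.txt from /a to /b; the parent directory /b is created if missing
moveFile("/a/a.txt", "/b/a.txt");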
IV. Summary
As the examples above show, connecting to HDFS from Java and performing basic file operations is fairly straightforward. The operations covered include checking whether a file exists, inserting content at the beginning or end of a file, deleting a file, and moving a file. Mastering these basics is essential for handling big-data tasks.
Copyright notice: This post is original content. Please keep the original link and author information when reposting.