I/O, NIO & Serialization
Every service reads config, logs to disk, streams uploads, and talks over sockets. Java offers three layers: classic java.io streams, java.nio channels and buffers for scalable I/O, and java.nio.file (NIO.2) for modern filesystem work. This chapter covers each layer, when to pick which, and why native Java serialization is a security footgun in production.
Three layers of Java I/O
Pick the API that matches your abstraction: byte/char streams for simple pipelines,
channels and buffers when you need non-blocking or memory-mapped I/O,
Path/Files for filesystem operations without legacy File.
| API | Package | Best for |
|---|---|---|
| Classic I/O | java.io | Teaching, legacy libs, simple read/write, decorators |
| NIO | java.nio | Sockets, high concurrency, direct buffers, selectors |
| NIO.2 | java.nio.file | Paths, copy/move, directory walks, watch folders |
Modern application code often combines NIO.2 for files with HTTP/gRPC clients for network I/O—raw socket code is rarer but still appears in agents, games, and custom protocols.
Prefer try-with-resources on every Closeable—streams, readers, channels, and Stream<Path> from Files.walk must be closed to avoid descriptor leaks.
Classic I/O (java.io)
Stream-oriented I/O: read/write one byte or char at a time (or in chunks via decorators). Every java.io read and write blocks the calling thread; non-blocking I/O requires NIO channels.
Stream hierarchy — bytes vs characters
Byte streams (InputStream / OutputStream) move raw 8-bit data—files, sockets, images, compressed bytes.
Character streams (Reader / Writer) handle 16-bit UTF-16 units and apply charset encoding/decoding at the boundary.
Use byte streams for binary payloads; use character streams for text when you know the charset (always specify StandardCharsets.UTF_8—never rely on platform default).
InputStream ──► FileInputStream, Socket.getInputStream(), FilterInputStream
└── BufferedInputStream, DataInputStream, ObjectInputStream
Reader ──► InputStreamReader (bytes → chars), FileReader*, BufferedReader
* FileReader uses default charset — prefer Files.newBufferedReader(path, UTF_8)
Bridge with InputStreamReader / OutputStreamWriter when a library only exposes byte streams but you need text.
Decorators stack on streams—see Design Patterns: Decorator for the I/O pipeline mental model.
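The bridge from bytes to text can be sketched as follows. This is a minimal illustration, not a canonical utility: the class name CharsetBridge and the method readText are made up for the example, and the input is an in-memory byte array rather than a real file or socket.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetBridge {
    /** Decodes every byte of the stream into text using the given charset. */
    static String readText(InputStream bytes, Charset charset) {
        // InputStreamReader is the byte-to-char bridge; BufferedReader is a buffering decorator on top.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(bytes, charset))) {
            StringBuilder text = new StringBuilder();
            char[] chunk = new char[1024];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                text.append(chunk, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "caf\u00e9" is four chars but five bytes in UTF-8 (the accented e takes two bytes).
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8).length());      // 4
        System.out.println(readText(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1).length()); // 5
    }
}
```

Decoding the same bytes with the wrong charset does not fail; it silently yields a different string, which is why the charset should be named at every byte/char boundary.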
BufferedReader and BufferedWriter — why buffering matters
Each unbuffered read() can syscall into the OS—expensive for line-oriented text or many small writes.
BufferedReader fills an internal char array (8,192 chars by default) and serves reads from memory; readLine() scans that array for a line terminator (\n, \r, or \r\n) without per-character syscalls.
Rule of thumb: wrap any bare FileReader or InputStreamReader in BufferedReader unless you read large known-size blocks yourself.
try (var reader = new BufferedReader(
        new InputStreamReader(Files.newInputStream(path), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        process(line);
    }
}
FileInputStream, FileOutputStream, FileReader, FileWriter
FileInputStream / FileOutputStream read/write bytes to a path string or File.
FileReader / FileWriter are convenience wrappers using the JVM default charset—brittle across environments.
Prefer NIO.2: Files.newInputStream(path), Files.newBufferedReader(path, UTF_8)—clearer errors (NoSuchFileException) and no legacy File API.
// Classic byte copy
try (var in = new FileInputStream("in.bin");
     var out = new FileOutputStream("out.bin")) {
    in.transferTo(out); // InputStream.transferTo, Java 9+
}
// NIO.2 (preferred for files)
Files.copy(Path.of("in.bin"), Path.of("out.bin"), StandardCopyOption.REPLACE_EXISTING);
PrintWriter and Scanner
PrintWriter formats text output (print, println, printf) and can auto-flush on newline.
Does not throw IOException on write—check checkError() after important writes or use a writer that propagates exceptions.
Scanner tokenizes input (delimiters, regex, primitives). Convenient for stdin and small files;
for large files use BufferedReader lines or Files.lines to avoid regex-based tokenizing overhead and locale-sensitive number parsing surprises.
try (var out = new PrintWriter(
        Files.newBufferedWriter(logPath, StandardCharsets.UTF_8))) { // charset belongs to the writer; PrintWriter has no (Writer, Charset) constructor
    out.printf("%s %d%n", Instant.now(), count);
}
try (var scan = new Scanner(path, StandardCharsets.UTF_8)) { // Scanner(Path, Charset), Java 10+
    while (scan.hasNextLine()) {
        String line = scan.nextLine();
        process(line);
    }
}
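The point about PrintWriter swallowing IOException is easy to miss, so here is a small sketch of it. FailingWriter is a hypothetical sink invented for the demo (it stands in for a full disk or closed socket), and writeAndCheck is an illustrative helper, not part of any library.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

final class PrintWriterErrors {
    /** Hypothetical sink that fails on every write, standing in for a full disk. */
    static final class FailingWriter extends Writer {
        @Override public void write(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("simulated disk full");
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Returns true if PrintWriter recorded a failure while writing the line. */
    static boolean writeAndCheck(Writer sink, String line) {
        PrintWriter out = new PrintWriter(sink);
        out.println(line);       // never throws, even when the sink does
        return out.checkError(); // flushes, then reports whether any write failed
    }

    public static void main(String[] args) {
        System.out.println(writeAndCheck(new StringWriter(), "ok"));    // false
        System.out.println(writeAndCheck(new FailingWriter(), "lost")); // true
    }
}
```

When a lost write matters (audit logs, exports), writing through a BufferedWriter directly keeps the IOException visible instead of turning it into a flag.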
Forgetting to close streams leaks file descriptors—on Linux you may hit “Too many open files” under load. Always use try-with-resources (see Exceptions for suppressed exception behavior).
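FileReader and FileWriter use Charset.defaultCharset(). Since Java 18 (JEP 400) that is UTF-8 on every platform unless overridden with -Dfile.encoding; on older JVMs it follows the OS locale, which is often UTF-8 on modern Linux/macOS but not guaranteed on Windows deployments. Explicit UTF-8 avoids mojibake when config moves between machines.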
NIO (java.nio) — Java 4+
Buffer-oriented I/O: data lives in ByteBuffer objects, transferred via Channel implementations.
Selector multiplexes many channels on one thread—foundation for non-blocking servers (and Netty-style frameworks).
Channels, buffers, selectors
| Type | Role | Examples |
|---|---|---|
| Buffer | Contiguous memory for read/write | ByteBuffer, CharBuffer |
| Channel | Connection to I/O source/sink | FileChannel, SocketChannel, ServerSocketChannel |
| Selector | Ready-set for many channels | select() until read/write/connect ready |
FileChannel supports zero-copy transferTo/transferFrom and memory-mapped files (map) for large read-mostly data.
try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
     FileChannel out = FileChannel.open(dst,
             StandardOpenOption.WRITE, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
    long pos = 0;
    long size = in.size();
    while (pos < size) {
        pos += in.transferTo(pos, size - pos, out); // may transfer fewer bytes than requested; OS may zero-copy
    }
}
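Memory mapping, mentioned above alongside transferTo, can be sketched like this. The class name MappedCount and the newline-counting task are illustrative assumptions; the idea is only to show FileChannel.map with a read-only mapping, windowed because a single mapping cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedCount {
    /** Counts '\n' bytes by mapping the file read-only instead of copying it into heap arrays. */
    static long countNewlines(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long count = 0;
            long pos = 0;
            while (pos < size) {
                // One mapping covers at most Integer.MAX_VALUE bytes, so walk the file in windows.
                long len = Math.min(Integer.MAX_VALUE, size - pos);
                MappedByteBuffer region = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (region.hasRemaining()) {
                    if (region.get() == '\n') {
                        count++;
                    }
                }
                pos += len;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        Path sample = Files.createTempFile("mapped-demo", ".txt");
        Files.writeString(sample, "a\nb\nc\n");
        System.out.println(countNewlines(sample)); // 3
    }
}
```

A mapping stays valid after the channel is closed and is released only when the buffer is garbage-collected, which is one reason mapped files suit large, read-mostly data better than short-lived temp files.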
ByteBuffer — allocate, direct, flip, clear
ByteBuffer.allocate(n) creates a heap buffer—GC-managed, fine for most apps.
allocateDirect(n) allocates off-heap memory—higher setup cost, better for long-lived buffers copied by JNI/OS (socket pipelines), must be monitored for native memory pressure.
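A buffer has no explicit read/write mode; its state is four values (capacity, limit, position, mark) with the invariant mark <= position <= limit <= capacity. The methods below move them: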
- flip() — after writing: limit = position, position = 0 → ready to read what you wrote
- clear() — position = 0, limit = capacity → prepare for new write (does not zero bytes)
- rewind() — position = 0, limit unchanged → reread without writing again
- compact() — copy unread bytes to start, position at end of copied data → partial consume then continue writing
ByteBuffer buf = ByteBuffer.allocate(256);
buf.put("hello".getBytes(StandardCharsets.US_ASCII));
buf.flip(); // switch to read mode
while (buf.hasRemaining()) {
    System.out.print((char) buf.get());
}
buf.clear(); // reuse for next write
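compact() is the least intuitive of the four operations, so a worked sketch may help. It assumes a hypothetical newline-delimited protocol; the class CompactDemo and the method drainLines are invented for the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CompactDemo {
    /**
     * Consumes complete '\n'-terminated lines from buf (in write mode on entry)
     * and leaves any trailing partial line at the front, ready for more writes.
     */
    static List<String> drainLines(ByteBuffer buf) {
        List<String> lines = new ArrayList<>();
        buf.flip();                               // write mode -> read mode
        int lineStart = buf.position();           // 0 right after flip()
        for (int i = lineStart; i < buf.limit(); i++) {
            if (buf.get(i) == '\n') {             // absolute get: does not move position
                byte[] line = new byte[i - lineStart];
                buf.get(line);                    // relative get: consumes the line bytes
                buf.get();                        // consume the '\n' itself
                lines.add(new String(line, StandardCharsets.US_ASCII));
                lineStart = i + 1;
            }
        }
        buf.compact();                            // unread bytes move to index 0; back to write mode
        return lines;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put("ab\ncd\nef".getBytes(StandardCharsets.US_ASCII));
        System.out.println(drainLines(buf));      // [ab, cd]
        buf.put("gh\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(drainLines(buf));      // [efgh]
    }
}
```

Using clear() instead of compact() in drainLines would reset position to 0 and let the next write overwrite the partial line "ef".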
Non-blocking I/O with Selector — multiplexed server
Set channels non-blocking (configureBlocking(false)), register with a Selector for OP_ACCEPT, OP_READ, OP_WRITE.
One thread calls selector.select() and processes only ready channels—avoids one thread per connection.
Production frameworks (Netty, Jetty internals) extend this model; understand the primitives before using them.
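Selector selector = Selector.open();
ServerSocketChannel server = ServerSocketChannel.open();
server.bind(new InetSocketAddress(8080));
server.configureBlocking(false);
server.register(selector, SelectionKey.OP_ACCEPT);
ByteBuffer buf = ByteBuffer.allocate(1024);
while (true) {
    selector.select(); // blocks until at least one channel is ready
    // Explicit Iterator: removing from the set inside a for-each risks ConcurrentModificationException
    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
    while (keys.hasNext()) {
        SelectionKey key = keys.next();
        keys.remove(); // the selector never clears the selected-key set itself
        if (key.isAcceptable()) {
            SocketChannel client = server.accept();
            if (client == null) continue; // non-blocking accept found no pending connection
            client.configureBlocking(false);
            client.register(selector, SelectionKey.OP_READ);
        } else if (key.isReadable()) {
            SocketChannel ch = (SocketChannel) key.channel();
            buf.clear();
            int n = ch.read(buf);
            if (n == -1) { ch.close(); key.cancel(); continue; }
            buf.flip();
            ch.write(buf); // echo demo; a non-blocking write may be partial, real code keeps the rest and waits for OP_WRITE
        }
    }
}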
Virtual threads (Java 21) often use blocking I/O with massive concurrency—simpler than selectors for many HTTP services. Selectors still matter for custom protocols, proxies, and embedded high-connection-count servers.
NIO.2 (java.nio.file) — Java 7+
Path-centric filesystem API: immutable Path objects, static helpers on Files,
rich exceptions, and optional atomic moves—replaces most java.io.File usage.
Path API
Path.of("a", "b", "c") (Java 11+) and Paths.get(...) build platform paths.
resolve joins a relative segment; resolveSibling replaces the file name.
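relativize computes a relative path between two paths (throws IllegalArgumentException when none exists, e.g. different roots on Windows or one absolute and one relative path).
normalize removes . and .. lexically, without touching the filesystem; toAbsolutePath and toRealPath (which resolves symlinks and requires the file to exist) matter for security checks.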
Path base = Path.of("/var/app");
Path config = base.resolve("config").resolve("app.yml");
Path logs = base.resolveSibling("logs"); // /var/logs if base is /var/app
Path a = Path.of("/var/app/data");
Path b = Path.of("/var/app/logs/out.log");
Path rel = a.relativize(b); // ../logs/out.log
Files utility — major operations
Static methods on Files throw checked IOException subclasses—handle or declare. Grouped by task:
| Category | Methods | Notes |
|---|---|---|
| Existence / type | exists, notExists, isDirectory, isRegularFile, isSymbolicLink | Use LinkOption.NOFOLLOW_LINKS to not follow symlinks |
| Create | createFile, createDirectory, createDirectories | createDirectories creates parents |
| Copy / move | copy, move | COPY_ATTRIBUTES, REPLACE_EXISTING, ATOMIC_MOVE |
| Delete | delete, deleteIfExists | delete throws NoSuchFileException if missing; deleteIfExists returns false |
| Read / write | readAllBytes, readAllLines, readString, write, writeString | Small files only—loads entire file |
| Streams | lines, newInputStream, newOutputStream, newBufferedReader, newBufferedWriter | Close streams; lines returns Stream |
| Metadata | size, getLastModifiedTime, setAttribute, readAttributes | POSIX attributes on supported FS |
| Probe | probeContentType | Guesses MIME type (detection is implementation-specific; may return null) |
| Links | createSymbolicLink, createLink | Hard vs symbolic links |
| Permissions | isReadable, isWritable, isExecutable | Also setPosixFilePermissions |
Files.list(dir) returns only immediate children (one level); walk recurses. Both return lazy streams backed by open directory handles, so close them. For huge trees, bound walk with maxDepth or use a custom FileVisitor, which lets you skip subtrees and handle per-file errors.
try (Stream<Path> entries = Files.list(Path.of("data/inbox"))) {
    entries.filter(Files::isRegularFile)
           .forEach(this::ingest);
}
Path dir = Files.createDirectories(Path.of("data", "inbox"));
Path file = dir.resolve("note.txt");
Files.writeString(file, "hello\n", StandardCharsets.UTF_8,
        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
String text = Files.readString(file, StandardCharsets.UTF_8);
Files.copy(file, dir.resolve("backup.txt"), StandardCopyOption.REPLACE_EXISTING);
// move does not create missing parent directories, so create archive/ first
Path archived = Files.createDirectories(dir.resolve("archive")).resolve("note.txt");
Files.move(file, archived, StandardCopyOption.ATOMIC_MOVE);
long bytes = Files.size(archived);
String mime = Files.probeContentType(archived); // may be null
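For files too large for readAllLines or readString, the Streams row of the table applies. A minimal sketch, assuming a hypothetical log-scanning task (LineScan and countMatching are names invented here):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class LineScan {
    /** Counts lines containing the marker without loading the whole file into memory. */
    static long countMatching(Path file, String marker) throws IOException {
        // Files.lines reads lazily; try-with-resources closes the underlying file handle.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(marker)).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path sample = Files.createTempFile("linescan-demo", ".log");
        Files.writeString(sample, "INFO start\nERROR disk\nERROR net\n");
        System.out.println(countMatching(sample, "ERROR")); // 2
        Files.delete(sample);
    }
}
```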
Walking directory trees
Files.walk(root) returns a Stream<Path> depth-first—must close the stream.
maxDepth overload limits recursion. FileVisitOption.FOLLOW_LINKS follows symlinks (cycle risk).
Files.walkFileTree(root, visitor) invokes FileVisitor callbacks:
preVisitDirectory, visitFile, visitFileFailed, postVisitDirectory—return CONTINUE, SKIP_SUBTREE, TERMINATE, or SKIP_SIBLINGS.
Use for deletes, size totals, or custom indexing with control over failure handling.
// Stream: sum sizes of .log files
try (Stream<Path> paths = Files.walk(Path.of("/var/log"), 3)) {
    long total = paths
            .filter(Files::isRegularFile)
            .filter(p -> p.toString().endsWith(".log"))
            .mapToLong(p -> {
                try { return Files.size(p); } catch (IOException e) { return 0L; }
            })
            .sum();
}
// FileVisitor: delete tree after processing
Files.walkFileTree(Path.of("tmp/work"), new SimpleFileVisitor<>() {
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
    }
    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null) throw exc;
        Files.delete(dir); // directory is empty by now: children are visited first
        return FileVisitResult.CONTINUE;
    }
});
WatchService — filesystem change events
Register a Path directory with a WatchService for ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE.
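take() (blocking) or poll() (non-blocking) returns a WatchKey; call key.pollEvents() for its batch of events, then key.reset() to re-arm the key, otherwise it delivers nothing further.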
Behavior is OS-dependent (coalescing, missing events on fast churn)—use for dev reload and cache invalidation, not as sole source of truth.
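WatchService watcher = FileSystems.getDefault().newWatchService();
Path dir = Path.of("config");
dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
while (true) {
    WatchKey key = watcher.take(); // blocks until events are queued
    for (WatchEvent<?> event : key.pollEvents()) {
        if (event.kind() == StandardWatchEventKinds.OVERFLOW) continue; // events were dropped; context is null
        Path changed = dir.resolve((Path) event.context());
        reloadConfig(changed);
    }
    if (!key.reset()) break; // key is no longer valid, e.g. the directory was deleted
}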
Glob patterns and PathMatcher
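The argument to getPathMatcher has the form syntax:pattern; the default filesystem supports glob: and regex:, and the prefix is mandatory.
FileSystems.getDefault().getPathMatcher("glob:**/*.java") matches paths; ** crosses directory boundaries while * stays within a single name element.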
PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:**/*.{java,kt}");
try (Stream<Path> walk = Files.walk(Path.of("src"))) {
    walk.filter(Files::isRegularFile)
        .filter(matcher::matches)
        .forEach(System.out::println);
}
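The difference between * and ** is easiest to see by probing a few patterns directly. The helper below (GlobDemo.matches, a name made up for this sketch) just wraps getPathMatcher; the expected results assume a Unix-style default filesystem.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

final class GlobDemo {
    /** Matches a path string against a "syntax:pattern" expression on the default filesystem. */
    static boolean matches(String syntaxAndPattern, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        return matcher.matches(Path.of(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("glob:*.java", "Foo.java"));              // true
        System.out.println(matches("glob:*.java", "src/Foo.java"));          // false: * does not cross '/'
        System.out.println(matches("glob:**/*.java", "src/main/Foo.java"));  // true
        System.out.println(matches("glob:**/*.java", "Foo.java"));           // false: pattern requires a directory
        System.out.println(matches("glob:**.{java,kt}", "src/App.kt"));      // true
        System.out.println(matches("regex:.*\\.java", "src/Foo.java"));      // true
    }
}
```

The matcher tests the whole Path as given, not just the file name, so a pattern like glob:*.java needs path.getFileName() if only the name should be compared.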
User-supplied paths must be validated: resolve against a trusted base directory and call startsWith(base) after normalize to block path traversal (../../etc/passwd).
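A minimal sketch of that check follows. SafePaths and resolveUnder are hypothetical names, and the guard is purely lexical: it does not follow symlinks, so a deployment that allows symlinks inside the base directory would additionally compare toRealPath() results once the file exists.

```java
import java.nio.file.Path;

final class SafePaths {
    /**
     * Resolves userInput under base and rejects anything that escapes it.
     * Lexical check only: symlinks are not resolved.
     */
    static Path resolveUnder(Path base, String userInput) {
        Path root = base.toAbsolutePath().normalize();
        // resolve() returns userInput unchanged if it is absolute, which the check below also rejects.
        Path candidate = root.resolve(userInput).normalize();
        if (!candidate.startsWith(root)) { // compares whole name elements, not string prefixes
            throw new IllegalArgumentException("path escapes base directory: " + userInput);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path base = Path.of("uploads");
        System.out.println(resolveUnder(base, "reports/../avatar.png")); // <cwd>/uploads/avatar.png
        try {
            resolveUnder(base, "../../etc/passwd");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Path.startsWith is used rather than String.startsWith because the string form would accept a sibling such as uploads-evil.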
Java serialization
Native binary serialization via ObjectOutputStream / ObjectInputStream encodes object graphs with class metadata.
Convenient for RMI-era persistence—discouraged for new systems due to fragility and exploit history.
Serializable and object streams
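Implement java.io.Serializable (marker interface) to opt in. Non-transient fields are written recursively;
static fields are not serialized. Deserializing an ordinary class bypasses that class's own constructors (only the no-arg constructor of its nearest non-serializable superclass runs), so constructor validation is skipped unless you add hooks. Records are the exception: they are always rebuilt through their canonical constructor.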
record UserSession(String userId, Instant loginAt) implements Serializable {
    @Serial private static final long serialVersionUID = 1L;
}
try (var out = new ObjectOutputStream(Files.newOutputStream(path))) {
    out.writeObject(session);
}
try (var in = new ObjectInputStream(Files.newInputStream(path))) {
    UserSession restored = (UserSession) in.readObject();
}
serialVersionUID — define it explicitly
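Each serializable class has a version fingerprint. If the UID recorded in the stream differs from the UID of the class being loaded, deserialization throws InvalidClassException.
Without an explicit declaration, the serialization runtime computes serialVersionUID by hashing the class shape (name, interfaces, fields, methods), so even a compatible-looking change, or a different compiler, may shift it silently.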
Declare private static final long serialVersionUID = 1L; (or a computed constant) intentionally when you evolve schemas.
For records and immutable DTOs, prefer JSON/Protobuf instead of evolving binary Java serialization.
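To see which UID the runtime will actually use for a class, ObjectStreamClass can be queried directly. The two nested classes below are hypothetical examples; only the pinned value is predictable, since the computed one depends on class shape and compiler.

```java
import java.io.ObjectStreamClass;
import java.io.Serial;
import java.io.Serializable;

final class UidDemo {
    /** Explicit UID: stays 42 no matter how fields and methods change. */
    static final class Pinned implements Serializable {
        @Serial private static final long serialVersionUID = 42L;
        String name;
    }

    /** No explicit UID: the runtime hashes the class shape. */
    static final class Computed implements Serializable {
        String name;
    }

    /** Returns the UID the serialization runtime associates with a Serializable class. */
    static long uidOf(Class<? extends Serializable> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Pinned.class));   // 42
        System.out.println(uidOf(Computed.class)); // some hash; changes when the class shape changes
    }
}
```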
transient keyword
Fields marked transient are skipped during default serialization—recompute on deserialize (e.g. caches, derived keys, secrets).
transient alone does not protect a secret: if a custom writeObject writes the field explicitly, it ends up in the blob anyway. Keep passwords and raw tokens out of serialized data.
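A round trip through in-memory streams shows the effect of transient under default serialization. Login and roundTrip are hypothetical names for this sketch, and checked exceptions are wrapped only to keep the demo short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serial;
import java.io.Serializable;

final class TransientDemo {
    static final class Login implements Serializable {
        @Serial private static final long serialVersionUID = 1L;
        final String user;
        transient String token; // skipped by default serialization

        Login(String user, String token) {
            this.user = user;
            this.token = token;
        }
    }

    /** Serializes to a byte array and reads the object back. */
    static Login roundTrip(Login login) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(login);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Login) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Login restored = roundTrip(new Login("ada", "s3cret"));
        System.out.println(restored.user);  // ada
        System.out.println(restored.token); // null
    }
}
```

The restored transient field holds the type's default value (null, 0, false), so code that reads it after deserialization has to rebuild it or tolerate the default.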
Custom readObject / writeObject
Private methods writeObject(ObjectOutputStream) and readObject(ObjectInputStream) hook default serialization—validate invariants, encrypt fields, or migrate versions.
readObjectNoData initializes state when the stream contains no data for this class, for example when a serializable superclass was added after the data was written. Implement Externalizable for full manual control (rare).
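class Account implements Serializable {
    @Serial private static final long serialVersionUID = 2L;
    private final String id;
    private transient String displayCache;
    Account(String id) { // a final field must be assigned in a constructor
        this.id = id;
        this.displayCache = "acct:" + id;
    }
    @Serial
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes non-transient fields (id)
        // omit or encrypt sensitive derived state
    }
    @Serial
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (id == null) throw new InvalidObjectException("id is required"); // validate invariants
        displayCache = "acct:" + id; // rebuild transient
    }
}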
Serialization vulnerabilities and alternatives
Untrusted ObjectInputStream.readObject() can instantiate attacker-chosen classes (“gadget chains”) leading to remote code execution—
a recurring CVE theme in app servers and libraries. Treat Java deserialization like executing untrusted code.
Attack model: malicious bytes reference classes on the classpath whose methods chain together (readObject in commons-collections, Spring, etc.). Defenses layer:
- Do not deserialize untrusted data: best fix
- Allowlists: ObjectInputFilter (Java 9+) on stream or JVM-wide
- Isolate: separate classloader with minimal classpath
- Replace format: schema-first codecs without arbitrary class instantiation
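ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
        "com.myapp.**;java.base/*;!*"); // module patterns are written module/pattern; "java.base.*" would mean a package named java.base
var in = new ObjectInputStream(inputStream);
in.setObjectInputFilter(filter); // set before the first readObject()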
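The following sketch shows an allowlist filter accepting JDK classes and rejecting an application class. Note is a hypothetical class invented for the demo, the pattern "java.base/*;!*" is deliberately stricter than a real application would use, and the helpers wrap checked exceptions only for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serial;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class FilterDemo {
    /** Hypothetical application class; it lives outside java.base, so the filter rejects it. */
    static final class Note implements Serializable {
        @Serial private static final long serialVersionUID = 1L;
        final String text;
        Note(String text) { this.text = text; }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if the bytes deserialize under a "java.base only" allowlist. */
    static boolean acceptedByFilter(byte[] bytes) {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter("java.base/*;!*");
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.setObjectInputFilter(filter); // must happen before the first readObject()
            in.readObject();
            return true;
        } catch (InvalidClassException rejected) {
            return false; // the filter returned REJECTED for some class in the stream
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptedByFilter(serialize(new ArrayList<>(List.of("a", "b"))))); // true
        System.out.println(acceptedByFilter(serialize(new Note("hi"))));                     // false
    }
}
```

The rejection happens before the rejected class is instantiated, which is the property that blunts gadget chains; the filter is still a mitigation layered on top of not accepting untrusted streams at all.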
| Format | Strengths | Typical use |
|---|---|---|
| JSON (Jackson, Gson) | Human-readable, web APIs | REST, config; bind to DTOs only |
| Protobuf / gRPC | Compact, schema evolution, fast | Microservices, high throughput |
| Avro | Schema registry, compact binary | Kafka pipelines, data lakes |
| Java Serialization | Graph preservation, RMI legacy | Avoid for new external boundaries |
Spring and Hibernate use Java serialization only in narrow places (HTTP session replication, some caches)—prefer JSON or dedicated stores for session state in new designs.
JSON deserialization is not automatically safe either: enabling default typing in Jackson, or any polymorphic type handling driven by the input, can revive similar gadget attacks. Bind to explicit DTO types, disable dangerous features, and validate input size.
Explain byte vs char streams, when NIO.2 replaces File, and buffer flip() vs clear(). For serialization: state why serialVersionUID matters and why deserializing untrusted input is dangerous—name JSON/Protobuf as safer alternatives.